Introduction
During the COVID-19 pandemic, near-real-time surveillance of excess mortality has been an essential tool for detecting increased and unexpected mortality in many countries [1–5]. It is sensitive to both direct and indirect effects of the pandemic and does not depend on COVID-19 testing patterns [6]. It provides information for managing the timing and extent of COVID-19 containment measures and for planning for increased demands on health services, and it remains critical for monitoring longer-term impacts of the pandemic on specific causes of mortality [7].
Excess mortality compares observed mortality with expected mortality, where expected mortality is commonly estimated using historic mortality rates. However, the pandemic has resulted in prolonged periods of mortality in excess of the number expected across the entire population. Without substantive periods of respite from waves of infection, there has not been a ‘catch-up period’ long enough for deaths to return to the level expected based on pre-pandemic trends. Consequently, estimates of the expected number of deaths (and hence of excess deaths) beyond the first wave of the pandemic are increasingly unreliable, for several reasons, some of which we address in this paper:
1. Deaths due to COVID-19 occurred among individuals who would otherwise have been expected to live longer [8].
2. People with co-morbidities were more likely than others to die early following COVID-19 infection. Reported mortality rates from these causes (e.g., acute coronary syndrome) will be lower, both during and after the pandemic, than they would have been in the absence of the COVID-19 pandemic [9, 10].
3. Conversely, some elective surgery or other types of treatment for co-morbidities were postponed during the pandemic, reducing the number of deaths expected to have occurred as a consequence of the surgery during the pandemic. However, delays in treatment might lead to higher long-term mortality from the affected conditions [11–13].
4. Disruption to normal life, during what has proved to be a lengthy pandemic, is likely to have affected levels of mortality and made the use of historic rates less reliable in estimating expected numbers of deaths. This disruption includes mobility restrictions, suppression of non-COVID-19 infections such as influenza, on-going public reticence to engage with health services, and the unknown effects of surviving COVID-19 infection on long-term mortality risk [14, 15].
For reasons 1–2, estimates of expected mortality are distorted by substantive and sometimes complex ‘mortality displacement’ due to COVID-19. Establishing an accurate prediction of expected deaths is crucial for routine surveillance as well as monitoring the ongoing and longer-term impacts of the pandemic. In this paper we focus on the displacement of mortality experienced by those who had a positive COVID-19 test result.
To date, population-level model-based approaches that aim to estimate short-term displacement of mortality following an exogenous shock have typically involved the use of Poisson or Quasi-Poisson non-linear time-lag models [16, 17]. However, the extent to which an individual's mortality is displaced is dependent on their underlying risk of mortality prior to infection. Furthermore, in the COVID-19 pandemic, unlike a community-wide temperature shock in which the whole population is ‘at risk’, it is only once an individual has contracted COVID-19 that they suffer a markedly increased risk of their death being brought forward. Thus, an effective analysis must focus on the mortality of individuals in relation to rapid changes in their underlying risk profile with time, and none of the previous methods naturally extend to encompass this possibility. Furthermore, the complex setting of repeated waves of COVID-19 infection affecting different individuals makes it difficult to apply any simple interpolation method.
In this paper, we use an exemplar dataset of patients from the English National Hip Fracture Database (NHFD) to adopt a Cox-regression-based methodology using time-dependent covariates to estimate the profile of the enhanced risk of death across time in individuals who contracted COVID-19 [18, 19]. This general approach can be applied to any adverse event-based outcome (death being just one example) that follows any ‘at-risk’ defining event (here it just happens to be developing COVID-19), to investigate impacts by time or by cause.
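To make the idea of a time-dependent covariate concrete, the following is a minimal sketch of how one person's follow-up might be split into counting-process rows of the form (start, stop], which is the usual input layout for Cox models with time-varying covariates. The function name, the band labels, and the band edges (4 and 12 weeks after a positive test) are hypothetical choices made for illustration; they are not taken from the paper.

```python
from typing import List, Optional, Tuple

# One row: (start_week, stop_week, died_at_stop, exposure_band)
Row = Tuple[float, float, bool, str]


def split_follow_up(exit_week: float, died: bool,
                    covid_week: Optional[float],
                    band_edges: Tuple[float, ...] = (4.0, 12.0)) -> List[Row]:
    """Split one person's follow-up (starting at week 0) into (start, stop] rows.

    Each row carries a time-dependent covariate: "none" before any positive
    test, then "band0", "band1", ... according to weeks since the test.
    band_edges must be sorted ascending; the defaults are illustrative
    assumptions only. covid_week is assumed to be >= 0 when present.
    """
    # No positive test before the end of follow-up: one unexposed interval.
    if covid_week is None or covid_week >= exit_week:
        return [(0.0, exit_week, died, "none")]

    rows: List[Row] = []
    if covid_week > 0:
        # Time at risk before the positive test is unexposed and event-free.
        rows.append((0.0, covid_week, False, "none"))

    # Cut points after the test, truncated at the end of follow-up.
    cuts = [covid_week]
    cuts += [covid_week + e for e in band_edges if covid_week + e < exit_week]
    cuts.append(exit_week)

    for k in range(len(cuts) - 1):
        is_last = k == len(cuts) - 2
        # The death (if any) is attached only to the final interval.
        rows.append((cuts[k], cuts[k + 1], died and is_last, f"band{k}"))
    return rows
```

Rows in this layout could then be passed to any software that fits Cox models on start/stop data, with the band label entered as a categorical covariate so that a separate hazard ratio is estimated for each period after the positive test.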
The hip fracture population represents an ideal cohort in which to study mortality displacement for several reasons. Firstly, hip fracture is one of the most common serious injuries in older people and is followed by a high mortality rate [20]. We therefore anticipate relatively short estimated mortality displacement times that fall within the follow-up period available for Cox modelling; that is, a large number of individuals who sustained a hip fracture could be expected to have died during the available follow-up time, regardless of whether they contracted COVID-19. Secondly, most individuals sustaining a hip fracture will be treated as a hospital inpatient where the risk of nosocomial exposure to COVID-19 was high, and we can have greater confidence in accurately capturing the timing of early COVID-19 infection compared to a community-based cohort, particularly during the first wave of the pandemic when community testing was not established. Thirdly, data for hip fracture patients in England are collected mandatorily by the NHFD as part of The Falls and Fragility Fracture Audit Programme (FFFAP), commissioned by the Healthcare Quality Improvement Partnership (HQIP) and managed by the Royal College of Physicians (RCP). As such, a wealth of individual-level data is available for this cohort (allowing for risk adjustment) through linkage of NHFD data to established national data sources, which are utilised in this study. Whilst we fully acknowledge that the mortality displacements seen in this population are not generalisable to the full English population, the hip fracture population is used as an ‘exemplar’ to demonstrate the end-to-end application of the methods presented.
Discussion
Using a national dataset of hip fracture patients as an illustrative dataset, we describe and demonstrate a novel and generalisable method for estimating the parameters underpinning mortality displacement consequent upon a positive COVID-19 test. We then use these parameters to adjust expected mortality, and thereby excess mortality (the difference between the number observed and the number expected in any time period), to facilitate on-going surveillance of excess mortality during the remainder of the pandemic and beyond.
Our results, based on the hip fracture dataset, indicate that estimating the current expected number of deaths solely from historical trends yields an overestimate, because many of the deaths expected on this basis would already have occurred earlier in the pandemic. This means that, over any period well after the start of the pandemic, excess deaths calculated from historical trends will be underestimated. Both are direct consequences of what may be called mortality displacement or harvesting.
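The relationship between these two biases can be written out as a short worked identity. The symbols are introduced here purely for illustration and are not the paper's own notation: \(O_t\) is the number of deaths observed in period \(t\), \(E_t^{\mathrm{hist}}\) is the number expected from historical trends, and \(D_t\) is the number of those expected deaths that had already been brought forward to an earlier period. The numbers in the comment are hypothetical.

```latex
\begin{align*}
E_t^{\mathrm{adj}} &= E_t^{\mathrm{hist}} - D_t, \\
X_t^{\mathrm{adj}} &= O_t - E_t^{\mathrm{adj}}
                    = \left(O_t - E_t^{\mathrm{hist}}\right) + D_t
                    = X_t^{\mathrm{hist}} + D_t .
\end{align*}
% Hypothetical numbers: O_t = 1050, E_t^hist = 1000, D_t = 30
% give E_t^adj = 970 and X_t^adj = 80, compared with X_t^hist = 50.
```

Under these assumptions, the unadjusted expected count is too high by \(D_t\) and the unadjusted excess is too low by the same amount.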
Although our results provide clear evidence of short-term mortality displacement in the population included in the national hip fracture registry, interpretation of its magnitude needs to take account of the fact that this is a select subgroup of the general population. Fractures that impede mobility in the weeks that follow the accident or surgery considerably shorten life expectancy (at least in the elderly) and are in themselves markers of significant frailty [13, 28]. In our analysis we have demonstrated that age, sex, and frailty all have substantial impacts on the frequency and magnitude of mortality displacement. Given that our ultimate aim is to generate quantitative estimates of the effect of mortality displacement in the general population, this implies that it is important to apply these methods to a representative sample of the national population. The key contribution of the analyses reported in this methodological paper is to demonstrate the methodology, working through it from end to end, which gives us the confidence to engage in the major task of applying these methods to national population data.
In interpreting the results of our analyses, it is important to note that individuals with low frailties and the largest displacements have almost no impact on short-term estimates of expected deaths (e.g. deaths over the next 12 months). This is because, with reference to Fig. 3, an individual who died of COVID-19 in a given week \(t\), and who would have been expected to live for a further 520 weeks (10 years) under the counterfactual, would only affect the count of expected deaths at week \(t+520\). In contrast, the subpopulations in which mortality displacement has the greatest effect on current, past, and near-future estimates of expected mortality are older people with large frailties, in whom statistical power is high because there are so many deaths. Therefore, among this group, the magnitude of displacement can be estimated more precisely. Although the magnitude of displacement in younger, low-frailty subpopulations can only be estimated with low precision, the precise magnitude is less important when adjusting current expected mortality estimates, because there is considerable certainty that individuals in this group would have survived well beyond the end of the prediction interval.
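The timing argument above can be illustrated with a small sketch. It assumes a deliberately simplified, deterministic adjustment in which each COVID-19 death removes exactly one death from the expected count in the week when that person would otherwise have died. This is not the authors' adjustment procedure, and the function name and input layout are hypothetical.

```python
from typing import List, Sequence, Tuple


def adjust_expected(expected: Sequence[float],
                    displaced: Sequence[Tuple[int, int]]) -> List[float]:
    """Remove displaced deaths from a weekly expected-deaths series.

    expected  : expected deaths from historical trends for weeks 0..len-1.
    displaced : (death_week, displacement_weeks) pairs, one per COVID-19
                death; the counterfactual death week is their sum.
    Simplifying assumption: each displaced death subtracts exactly 1.0.
    """
    adjusted = list(expected)
    for death_week, delta in displaced:
        counterfactual_week = death_week + delta
        # Only counterfactual deaths that fall inside the series matter.
        if 0 <= counterfactual_week < len(adjusted):
            adjusted[counterfactual_week] -= 1.0
    return adjusted


# One year of weekly expected deaths and three hypothetical COVID-19 deaths.
baseline = [100.0] * 52
example = adjust_expected(baseline, [(3, 8), (3, 520), (10, 20)])
```

In this example the death displaced by 520 weeks leaves the 52-week series unchanged, whereas the deaths displaced by 8 and 20 weeks each lower the expected count in one week of the series (weeks 11 and 30).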
These observations have important implications for the use of our method in practice. First, it is adequately powered for the short-term adjustment of expected mortality. Specifically, the estimates that are generated are more precise in frailer elderly subgroups because there are more deaths, whereas the lack of precision in younger, fitter populations will appropriately be reflected in wide confidence intervals, which should guard against over-interpretation. Second, the hazard ratios (relative risks) associated with COVID-19 infection, and their profile over time after infection, are estimated with precision. This means these hazard ratios are useful as epidemiological metrics that can feed directly into our understanding of the public health implications of COVID-19. Third, however, the method is less useful for another apparently attractive purpose: estimating years of life lost (YLL). In the \(i\)th individual this is obtained directly as \(\Delta_i\) (see above). Because this is so straightforward, and given that YLL is “a frequently used population health metric, originating back to the 1940s … and the idea is appealingly simple” [7], it sounds like an ideal use of the method. But YLL requires very careful interpretation. It is true that by applying our method across all age groups and summing the \(\Delta_i\) one could generate an overall point estimate of YLL, and by restricting this to people who died of COVID-19 and dividing the total by the number of such individuals one could obtain a point estimate for average YLL per COVID-19 death to compare to other equivalent estimates [29–32]. However, this would be of limited value. In subpopulations that are old or frail we can generate estimates of YLL with acceptable precision, and this could provide a useful public health metric of the impact of the pandemic in those frailer subpopulations. But in younger, fitter subpopulations, although we can confidently state that the relatively small number of individuals who died of COVID-19 lost many years of life, any formal quantification of those YLL would be too imprecise to be of value. Rather, in younger populations it is more sensible to estimate YLL using the “WHO Standard Approach” [30], based on applying an appropriate standard life table to people who died of COVID-19 [29]. We therefore believe that our methods, particularly once they have been applied to a nationally representative population, can provide a useful contribution to a description of YLL in older and frailer subpopulations. But we would not recommend that they be used, on their own, to attempt to generate an estimate for the average YLL per COVID-19 death across the entire population, which is better obtained in other ways [29–32].
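The point estimates described above amount to a sum and a mean of the individual displacements. The sketch below shows that arithmetic only; it assumes, for illustration, that each \(\Delta_i\) is expressed in weeks and that a year is 52 weeks (consistent with the 520 weeks = 10 years example earlier in this section). The function name is hypothetical, and the sketch deliberately returns no measure of uncertainty, which is the main reason the text cautions against relying on such estimates in younger, fitter subpopulations.

```python
from typing import Sequence, Tuple

WEEKS_PER_YEAR = 52.0  # assumption: matches 520 weeks = 10 years in the text


def years_of_life_lost(delta_weeks: Sequence[float]) -> Tuple[float, float]:
    """Return (total YLL, mean YLL per death) from displacements in weeks.

    delta_weeks holds one displacement per person who died of COVID-19.
    Both results are point estimates only.
    """
    if len(delta_weeks) == 0:
        raise ValueError("need at least one displacement")
    total_years = sum(delta_weeks) / WEEKS_PER_YEAR
    return total_years, total_years / len(delta_weeks)


# Three hypothetical deaths displaced by 1, 2 and 10 years.
total_yll, mean_yll = years_of_life_lost([52.0, 104.0, 520.0])
```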
The application of Cox hazards models with time-varying covariates to simulate survival times has been validated previously [33]. Model-based approaches, such as ours, have been used previously to explore short- and long-term displacement of mortality caused by exogenous events such as extreme temperatures, air pollution and influenza seasons. These have generally adopted deterministic and stochastic lag models [16, 17, 34], based on treating mortality at a population level as a Poisson (or Quasi-Poisson) outcome [35]. Analogous population-level approaches have been used in publications examining the extent of mortality displacement that had occurred leading up to the pandemic and its subsequent impact on pandemic mortality, and these indicate some level of displacement [8, 35–39]. In England and France, short-term mortality displacement has been estimated at a population level by comparing the number of deaths above the expected value with the number below the expected value within a given time period [8, 38]. In a USA study, no mortality displacement was identified; however, because the study period ended in May 2020, this may, among other issues, reflect the short follow-up [35].
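The population-level comparison of deaths above and below the expected value can be sketched as follows. This is a crude illustration of the general idea only, not a reproduction of the cited studies' methods; the function name and the weekly counts are hypothetical.

```python
from typing import Sequence, Tuple


def above_below_expected(observed: Sequence[float],
                         expected: Sequence[float]) -> Tuple[float, float]:
    """Sum deaths above and below expectation over a series of periods.

    Returns (above, below): the total surplus in periods where observed
    exceeds expected, and the total deficit in periods where it falls short.
    """
    if len(observed) != len(expected):
        raise ValueError("series must have the same length")
    above = sum(max(o - e, 0.0) for o, e in zip(observed, expected))
    below = sum(max(e - o, 0.0) for o, e in zip(observed, expected))
    return above, below


# Hypothetical weekly counts: a wave followed by a period of lower mortality.
above, below = above_below_expected([120.0, 110.0, 95.0, 90.0], [100.0] * 4)
```

A deficit that follows a surplus is compatible with short-term displacement, but a comparison of this kind cannot say which individuals' deaths were brought forward, which is the gap the individual-level approach in this paper is intended to fill.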
One method, which allows displacement to be estimated at an individual level in order to estimate years of life lost [40], takes the distribution of excess mortality by age group and applies the number of years of life expectancy lost in each group [41]. This method has been applied to data from Sweden and Norway, with the authors making adjustments to expected mortality [41]. The study estimated the YLL attributable to COVID-19 in 2020 to be 45,850 without adjustment for pre-pandemic seasonal influenza mortality displacement and 43,073 when adjusted for displacement. In Scotland, the number of YLL was estimated at approximately 15 per COVID-19 death in 2020 [42]. In the USA, authors used projections of life expectancy under different COVID-19 scenarios to identify variation in outcomes between age and ethnic groups [43]. For example, in a medium COVID-19 scenario the estimated reduction of life expectancy at birth in 2020 was approximately 1.13 years, with large ethnic disparities: life expectancy for Black and Latino groups was estimated to be reduced by 2.10 and 3.05 years respectively, compared with 0.68 years for White groups. However, these methodologies are limited to analysis among groups for whom life expectancy is calculated and are therefore unable to account for the complex interactions between co-morbidities, COVID-19 infection and risk of death. They therefore also assume that those who contract COVID-19 are a random subset of the population; yet evidence suggests a higher susceptibility to COVID-19 infection amongst more vulnerable subgroups [44, 45].
Limitations
We present the model in this paper as a proof of concept, though it can be extended to include as much detail on risk factors as is available within the data source. It does have some limitations. It ultimately depends on accurately identifying those testing positive for COVID-19 infection, which is problematic when analysing mortality for early pandemic periods when, in many countries, testing was limited. This may also become more problematic as the method is applied to later pandemic periods, when national policy decisions on testing affect the ability to collect accurate testing numbers. The method does not address displacement among those dying with COVID-19 but without a positive test, or mortality that occurred because of indirect effects of the COVID-19 pandemic, whether on the health system (such as limited access to services) or through other effects of containment measures. In both cases these groups will have been included among the non-COVID-19 deaths, resulting in an underestimate of mortality displacement. The method also does not address the problem of disruption to historic trends for other reasons and therefore does not attempt to estimate what would have occurred in the absence of the pandemic. Several approaches can be applied in parallel to address many of these questions. Theoretically, the approach depends on knowing all key variables, which is impossible in practice, as in all modelling situations. Additionally, estimation of the extent to which deaths are deferred is limited to what is observed within the follow-up period. Our “ball and urn” method for adjusting expected deaths, which uses counts of registered deaths where COVID-19 is mentioned on the death certificate (as opposed to formal COVID-19 testing), may over- or underestimate the true excess mortality of COVID-19, depending on how well a mention of COVID-19 represents the true underlying or contributory cause of death.
Finally, we were unable to derive confidence intervals at each stage and in aggregate due to constraints on computational resources. We recognise that the application of COVID-19 mortality displacement to adjust conventional expected deaths models, and the concept of excess mortality itself, may be approached using different methods, and further investigative work is required.
As in any model-based analysis, these limitations should be considered in interpreting results. However, compared with other complex modelling scenarios, the data we have used are very rich in information and should provide a useful approximation to what is happening in reality. Furthermore, if one focuses on the method as a way to build and use a platform that provides information to help guide public health intervention in a wide variety of ways during and after a substantial population-level public health shock, the limitations we highlight provide useful guidelines for how such platforms should be set up and evolve.
Implications and conclusions
Directly observing individuals within a population over time, monitoring events of interest and applying the methodology we have described has applications for answering a multitude of potential public health questions. We describe and apply the approach to temporal displacement of all-cause mortality. But it can equally well be applied to the displacement of cause-specific death and to adjustment of excess mortality associated with specific causes. This will allow better understanding of longer-term impacts of COVID-19 on important causes of death. The methods can also be extended to any adverse outcome (death is simply one example) following any ‘at-risk’ defining event (the example here being developing COVID-19). Furthermore, by assuming everybody in the population is ‘at risk’ from the moment a pre-defined event occurs, death (or another metric) can be tracked as the primary outcome. In all situations such as these, which include extreme temperature events, it is possible to estimate the long-term impact of any population-level event on any outcome – thereby providing critical information for public health planning.
Throughout the pandemic it has become increasingly apparent that there is a worldwide need to link individual-level datasets from various sources to create population-level platforms. Setting up country-wide cohorts that start at prespecified times and record critical events of interest could provide critical information for the purposes of public health surveillance, planning and intervention. Using data from such platforms, our work demonstrates that, by applying the extended Cox model and subsequently simulating the expected outcomes in both factual and counterfactual scenarios, it would be possible to answer a multitude of questions (e.g., at this time, the impact of COVID-19 infection on long-term comorbidity, health service utilisation and other health outcomes). Having a population platform such as this, and applying analytic methods, including modelling, at the individual level offers numerous benefits. Using whole populations with individual-level data allows greater precision and richer mathematical models, which provide a more robust approach to addressing the real challenges faced by public health science as well as by nations themselves. We believe that the ability to generate critical information to inform policy decisions and health planning, as well as to help manage major public health shocks, would make an investment in an infrastructure based on pseudonymised data to track real-time, pan-population health events worthwhile, given that the scientific and social returns greatly exceed its costs.