Background
For many diseases, estimates of incidence and prevalence are often incomplete or based on different data sources, making it difficult to compare results [1]. Reliable and representative population level epidemiological data are needed to inform health care policy and to support decision making processes in health service planning and delivery, and are essential to cost effectiveness analyses and burden of disease calculations. Due to gaps in directly measured data, models have been established that can estimate incidence and prevalence rates of diseases. DisMod II is such a model. It uses available epidemiological data about a condition to estimate missing data on incidence, prevalence, remission and case fatality rates as applicable [2, 3]. Originally developed for the Global Burden of Disease studies [4], DisMod II is freely available for use and can be downloaded from http://www.epigear.com/index.htm.
The DisMod II model is a multistate life table that fully describes the epidemiological progress of a single disease by exploiting the fact that parameters such as incidence, prevalence, remission, case fatality and mortality rates are not independent variables. By solving a set of differential equations, DisMod II can estimate age-specific incidence, prevalence or case fatality rates for a disease, given sufficient data on the others (for example, with input data of age-specific prevalence, case fatality and mortality data for a disease, DisMod II will estimate the age-specific incidence rate for the disease). The model operates by calculating the number of people at any age in each of three states: healthy, diseased and dead. Within the model, there are two causes of death, either from the disease or from ‘all other’ causes, that are assumed to be independent. There are four age-specific transition hazards (each assumed to be constant within a 1-year age interval): incidence, remission, case fatality, and the “all other mortality” hazard. The input data for the model are age and sex-specific estimates of three out of the four parameters described above for a given population, and the output of the model is a complete set of parameters (smoothed from the original or estimated from the original parameters).
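The three-state structure described above can be illustrated with a minimal sketch. This is not the DisMod II implementation: it only runs the model forwards (from hazards to prevalence) using simple Euler sub-steps, whereas DisMod II can also solve for a missing hazard. The function name `run_cohort`, the `substeps` parameter and the hazard values in the usage note are assumptions made for illustration.

```python
def run_cohort(incidence, remission, case_fatality, other_mortality, substeps=100):
    """Follow a cohort through healthy, diseased and dead states.

    Each argument is a list of annual hazards, one value per 1-year age
    interval (hazards are held constant within each interval).
    Returns the prevalence among the living at the start of each age,
    and the final (healthy, diseased, dead) proportions.
    """
    healthy, diseased, dead = 1.0, 0.0, 0.0
    prevalence = []
    dt = 1.0 / substeps  # Euler step, as a fraction of a year
    for i, r, f, m in zip(incidence, remission, case_fatality, other_mortality):
        prevalence.append(diseased / (healthy + diseased))
        for _ in range(substeps):
            # Healthy people leave via incidence or other-cause death,
            # and are replenished by remission.
            d_healthy = (-(i + m) * healthy + r * diseased) * dt
            # Diseased people arrive via incidence and leave via remission,
            # death from the disease, or death from other causes.
            d_diseased = (i * healthy - (r + f + m) * diseased) * dt
            # Both causes of death feed the dead state.
            d_dead = (m * healthy + (f + m) * diseased) * dt
            healthy += d_healthy
            diseased += d_diseased
            dead += d_dead
    return prevalence, (healthy, diseased, dead)
```

For example, `run_cohort([0.01] * 50, [0.0] * 50, [0.05] * 50, [0.01] * 50)` follows a cohort for 50 years with hypothetical constant hazards and no remission, which is a reasonable simplification for a condition defined as "ever having had" an event.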
Ischaemic heart disease (IHD) is the most common cause of death in the UK [5] and acute myocardial infarction (AMI) is coded on death certificates as the cause of approximately one third of all deaths from IHD [6]. In England, AMI mortality data and data on the prevalence of having had an AMI are routinely collected by the Office for National Statistics (ONS) and the Health Survey for England (HSfE) [7], respectively. However, until recently, there have been no published comprehensive, population-based national level estimates of AMI incidence [8] and these recent estimates are unlikely to be routinely updated. Incidence of AMI is important to researchers and public health policy makers because it serves as an indicator of the effectiveness of preventative measures and management of risk factors through health promotion and other public health initiatives. Without a tool that allows routinely updated estimates of incidence, measurements of the current burden of AMI in England have significant limitations.
This study assesses the external validity of DisMod II estimates of the incidence of first AMI in England by comparison with estimates generated from the dataset used to support a recent series of related papers [8–10]. Establishing the external validity of the modelled estimates would demonstrate that the DisMod II model could be used as a tool for regularly updating estimates of the incidence of AMI, data that are not routinely collected in England. It would also help to establish confidence in studies of non-communicable disease that use DisMod II to estimate incidence, such as in modelling studies [11], and studies estimating disease burden where incidence data are scarce or not regularly updated [12, 13]. Since the 2010 iteration, the Global Burden of Disease study [4] results have been based on an updated (but closely related) version of the DisMod model, which is not freely available for use. Assessments of the external validity of DisMod II offer insights into the assumptions used for this important and widely used global project.
Discussion
This study assessed the external validity of DisMod II estimates of the age-specific incidence rate of AMI in England in 2010. Although the modelled estimates and the external dataset resulted in incidence rates in the whole population that were of similar magnitude, the age-specific rates were not consistent with the external dataset; they over-estimated rates in younger age groups and under-estimated rates in the oldest age groups. Incorporating trends in the incidence and case fatality of AMI in the DisMod II estimates brought estimated and observed AMI incidence closer together in younger age groups, but produced more divergent results at older ages. For one set of results, this included the implausible scenario of women in their 70s having a lower incidence of AMI than women in their 60s. This study implies that DisMod II is not an appropriate source of age-specific estimates of the incidence of AMI for England and future estimates should be based on measured outcomes.
Modelled estimates of the burden of disease in different populations are extremely important for policy makers and health care planners. They allow for an assessment of where to direct scarce resources in terms of treatment and care, and assist in planning for future health care requirements. They can also be used as the basis for comparing the burden of disease that is attributable to different behavioural risk factors (e.g. the global comparable risk assessment exercise, Global Burden of Disease [3, 15]), or for modelling studies to assess the effectiveness or cost-effectiveness of public health interventions [11, 16, 17], which in turn influence how public health resources are directed.
The DisMod II model is the most recent freely available version of the model that is used for the Global Burden of Disease project and it is currently being used by non-communicable disease scenario models [18] which have been built in order to estimate the population health impact of public health interventions. For example, Cecchini et al. (2010) [11] used DisMod II to estimate the incidence of various cardiovascular diseases and risk factors when simulating the possible effects on health of different diet and physical activity interventions. Previous studies have also used DisMod II to estimate disease incidence from prevalence, remission, and disease specific mortality rates where incidence data are scarce. For example, Johnston et al. (2009) [13] estimated the global stroke burden and used DisMod II to estimate stroke incidence from mortality and case fatality data for countries where only partial data exist. Rehm et al. (2009) [12] used prevalence, relative risk of mortality, and remission rates to estimate the global incidence of alcohol-use disorders.
Wherever possible, the validity of the DisMod II modelled estimates should be assessed by comparison with actual measured data from the population of interest. This study utilised recent results estimating the incidence of AMI in England using linked hospital episode statistics and death certificates, which capture the vast majority of incident AMIs that occur in England [9] and as such represent a robust dataset for assessment of external validity. The present study complies with all of the ‘best practices’ identified in the ISPOR modelling guidelines for external validation studies [19].
The DisMod II software provides users with a variety of choices about how the input data should be manipulated before the outputs are calculated. These choices include: the method used to interpolate input data to single year estimates (cubic spline or polynomial methods); the shape of the curve to fit the smoothed age-related input data (linear, quadratic, sigmoid, simple exponent or polynomial); and whether or not to allow for time trends in the input data. These three choices alone would generate twenty different sets of results (2 × 5 × 2). Rather than display them all, we chose the settings that best suited the epidemiology of AMI. The alternative of using crude input data was not preferable due to inherent problems with the crude data. For example, the prevalence data used in the analyses were taken from the Health Survey for England [7], where rates are reported for 10 year age groups. Using the crude data led to large step changes in prevalence as age increases, which resulted in implausible shapes to the modelled incidence data. In practice, changing the specific selections for manipulating the input data did not improve the comparison between the DisMod II estimates and the external dataset.
We decided to report the results both for when trends in incidence and case fatality were applied and when they were not, as cardiovascular disease rates are decreasing rapidly in the UK and have been for some time [20]. This is important for the DisMod II model, as the method used for solving the differential equations assumes a ‘steady state’ for the disease being modelled (i.e. that age-specific prevalence, incidence and mortality rates for the disease are static). This allows DisMod II to assume that the prevalence of AMI at age t equals the prevalence of AMI at age t-1 plus incident cases minus dying cases. But input data are all taken from the same year (y), whereas data for age t-1 should be taken from year y-1. This is not a problem if the data in year y-1 are equal to those in year y, but if there are trends in the data (as is the case for falling AMI incidence rates), the assumption does not hold. For AMI, the incidence and case fatality rate trend data that we applied were taken from the British Regional Heart Study, a cohort study carried out in British men aged 40–59 at entry between 1978 and 2000 [14]. This study only examines trends in coronary heart disease in men; however, other studies report declines of similar magnitude in both men and women [21, 22]. A sensitivity analysis which applied more recent data that were specific to both men and women did not improve the validity of the modelled results. It is not possible to apply age-stratified trends in DisMod II and therefore the same trends were applied across all age groups in the analyses reported here. However, whilst overall there has been a decline in incidence rates, this decline varies by age, with the lowest rate of decline occurring in those aged 85 and over [9]. Both the mismatch between the external and modelled datasets and the reported declines in AMI incidence are age-specific, making the absence of age-specific trends a likely explanation for the failure of the model to produce externally valid estimates of the incidence of AMI in England. However, without further investigation using a model that can incorporate age-specific trends in incidence and mortality, it is not possible to confirm this.
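The problem with cross-sectional inputs under a trend can be made concrete with a small sketch. It assumes, purely for illustration, a single proportional annual decline applied equally at every age (as in the analyses reported here) and reconstructs the incidence a person of a given age would actually have experienced at each earlier age. The function name `cohort_incidence` and all numbers are hypothetical and are not taken from the study data.

```python
def cohort_incidence(cross_section, annual_decline, current_age):
    """Incidence experienced at each earlier age by someone aged
    `current_age` in the data year.

    cross_section[a] is the incidence at age a observed in the data year.
    annual_decline is the assumed proportional fall in incidence per
    calendar year, applied uniformly across ages (0.03 means 3% per year).
    """
    factor = 1.0 - annual_decline
    # A person aged current_age now was aged `age` (current_age - age)
    # years ago, when rates were higher by 1 / factor for each year.
    return [
        cross_section[age] / factor ** (current_age - age)
        for age in range(current_age + 1)
    ]
```

With `annual_decline=0.0` the function returns the cross-sectional rates unchanged, which is the steady state case. With a positive decline, rates at earlier ages are inflated relative to the cross-section, so prevalence accumulated by the cohort is higher than a steady state calculation from same-year data would imply.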
Another important limitation of our validity assessment is our input data for mortality. The ideal data for the DisMod II model would be estimates of the increased all-cause mortality rate for people who have previously had a heart attack. We were unable to find direct measures, so we used data on all deaths where AMI was indicated as either the primary cause or a contributing factor. This accounts for the fact that mortality rates from AMI are higher in those that have previously had a heart attack but may not include all increased mortality risk for other conditions (e.g. increased risk of pneumonia) [10, 23].
We found three other studies that compared outputs from the DisMod II software with measured epidemiological data. Manuel et al. (2007) [24] used AMI incidence data from linked hospital records and death certificates data to estimate prevalence of having had an AMI in Ontario, Canada and compared these modelled prevalence rates with estimates derived from a population health survey. The DisMod estimates for both men and women were very similar to those derived from the population health survey, and were within the 95% confidence intervals. However, the authors did not report on age-specific estimates of prevalence, so it is unclear whether the estimates from the two sources showed similar age trajectories. Saha et al. (2008) [25] compared estimates of prevalence and incidence of schizophrenia derived from DisMod II with paired incidence and prevalence estimates from 15 identified studies. They found that the DisMod II estimates of prevalence were generally higher than those identified in the studies and the estimates of incidence were generally lower, but no age-specific modelled prevalence or incidence rates were reported. One third of the modelled estimates were within 50% of the estimates from the identified studies. Kruijshaar et al. (2002) [26] compared DisMod estimates of prevalence for breast, prostate, colorectal and stomach cancer with cancer registry data from the Netherlands. Age-specific prevalence estimates were similar to observed rates for colorectal and stomach cancer, but considerably higher for prostate and breast cancer (for some ages modelled estimates were two and three times higher than measured rates, respectively). In all three studies the authors suggested that inadequate description of trends in the studied disease limits the accuracy of the modelled estimates. Given the findings from these studies, the results presented here, and the ongoing use of DisMod II in epidemiological modelling studies, it is important that the external validity of DisMod II be further examined with different disease outcomes in different populations.
Another potential source of error in our analyses is the accuracy of the estimates of prevalence of having had AMI from the HSfE. Although the HSfE series is broadly representative of the English population, it does not include residential care home settings in its sample structure. In 2011, around 260,000 people aged 75 and over lived in residential care homes (about 6% of this age group in England) [27]. Since the incidence rates for AMI are highest in older age groups, this omission may introduce bias for this study. Also, the estimates from the HSfE are based on self-report, which could underestimate true rates.
In the absence of measured epidemiological data, the DisMod II model can provide estimates of the incidence of AMI, which may be helpful for health researchers, health care planners and policy makers. Our research suggests that estimates for England may be broadly accurate when applied to the whole population, but can conceal large inaccuracies when studied by age group. One likely reason for this inaccuracy is the ‘steady state’ assumption of DisMod II, and as such the model should be used with caution when estimating the burden of diseases that are changing rapidly within the target population.
Acknowledgements
We acknowledge the work of Charlotte Boughton and Premila Webster on earlier versions of this paper.