In this article, we showed that the impact and extent of misclassification of disease status at death is driven by incidence of the chronic disease of interest.
Summary
Cross-sectional interview-based surveys can be used to collect information on the presence of chronic diseases in a population. However, if these interviews are used to obtain information about an individual’s disease status at death, misclassification is possible: study participants are interviewed only once, so the survey does not capture that some disease-free participants develop a chronic disease before death. We conducted a simulation study to assess the extent and direction of this possible misclassification bias and how it is influenced by the incidence of the chronic disease of interest. To this end, our simulation study was based on a high-incidence disease (type 2 diabetes) and a low-incidence disease (lupus erythematosus), with populations of 100 000 individuals transitioning through an illness-death model with Healthy, Diseased and Death (the final state for every individual) as possible states. For each chronic disease, a total of 200 sub-populations of 5000 individuals were randomly drawn from these populations. For every individual, a random age at study participation was simulated; a diagnosis at an age greater than this age remained unseen, so the individual was misclassified as non-diseased. For every population, the MRR was evaluated without and with possible misclassification of disease status at death (MicDaD). We compared median MRRs (with 2.5% and 97.5% quantiles) in the simulated populations without and with MicDaD. Misclassification of disease status at death led to underestimated MRRs for chronic diseases with a high incidence (such as type 2 diabetes). For low-incidence chronic diseases (such as lupus erythematosus), MicDaD caused little to no bias in the estimation of the MRR. This was the first study to investigate the impact of misclassification of disease status at death on MRRs.
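As an illustration only, the mechanism summarized above can be sketched in a few lines of Python. This is a minimal sketch, not the simulation code used in the study: the rate functions, the one-year time steps, the uniform distribution of the interview age, the age band and all names are assumptions made for this example. It simulates individuals in an illness-death model, hides every diagnosis made after a single random interview, and compares the MRR in one age band without and with MicDaD.

```python
import math
import random

# All rates are hypothetical placeholders, not the rates used in the study.
def incidence(age):
    return 0.015 if age >= 40 else 0.0      # assumed "high incidence" setting

def mort_healthy(age):
    return math.exp(-10.0 + 0.09 * age)     # assumed Gompertz-type mortality

def mort_diseased(age):
    return 2.0 * mort_healthy(age)          # assumed true rate ratio of 2 at every age

def simulate_person(rng, max_age=110):
    """Return (onset, death): first year lived as Diseased (or None) and year of death."""
    onset = None
    for age in range(max_age):
        rate = mort_healthy(age) if onset is None else mort_diseased(age)
        if rng.random() < 1.0 - math.exp(-rate):
            return onset, age
        if onset is None and rng.random() < 1.0 - math.exp(-incidence(age)):
            onset = age + 1
    return onset, max_age

def diagnosis_seen(rng, onset, death, min_interview_age=18):
    """One interview at a random age (assumed uniform); later diagnoses stay unseen."""
    if onset is None:
        return False
    interview_age = rng.uniform(min_interview_age, max(death, min_interview_age))
    return onset <= interview_age

def mrr_in_band(people, seen, lo, hi):
    """MRR (diseased vs. non-diseased) from deaths and person-years at ages lo..hi-1."""
    deaths = [0, 0]   # index 0: recorded as non-diseased, index 1: recorded as diseased
    years = [0, 0]
    for (onset, death), is_seen in zip(people, seen):
        recorded_onset = onset if is_seen else math.inf
        last = min(death, hi - 1)           # last year at risk inside the band
        if last < lo:
            continue
        total = last + 1 - lo
        healthy = max(0, min(last + 1, recorded_onset) - lo)
        years[0] += healthy
        years[1] += total - healthy
        if death < hi:
            deaths[1 if recorded_onset <= death else 0] += 1
    return (deaths[1] / years[1]) / (deaths[0] / years[0])

rng = random.Random(1)
people = [simulate_person(rng) for _ in range(10_000)]
truth = [onset is not None for onset, _ in people]
observed = [diagnosis_seen(rng, onset, death) for onset, death in people]

mrr_true = mrr_in_band(people, truth, 60, 70)
mrr_micdad = mrr_in_band(people, observed, 60, 70)
print(round(mrr_true, 2), round(mrr_micdad, 2))
```

Under these assumptions, the second value is expected to lie closer to 1 than the first, because deaths and person-years of individuals with an unseen diagnosis are counted in the non-diseased group.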
Interpretation
Analysis of simulated data in the high incidence setting showed a gap between the MRR of populations without and with MicDaD. Populations with MicDaD had smaller MRR values, with larger differences at younger ages. As the values of MRR shifted towards 1, underestimation of the MRR was detected in the high incidence setting when disease status at death was misclassified for some individuals in a population. Since the extent or number of undetected diagnoses is unknown in practice, it is possible that the bias caused by MicDaD is larger than in the simulations and settings considered in our study. Particularly in the case of chronic diseases with high incidences (and a mortality similar to that used in the presented simulation study), a serious underestimation of the risk of death caused by or with a chronic disease is to be expected. Results in the low incidence setting (based on incidence and mortality rates for lupus erythematosus) suggest that MicDaD has less impact for a chronic disease with a low incidence. Still, the MRR is underestimated, with values shifting towards 1. Bigger differences between MicDaD and no MicDaD and greater uncertainty in the low incidence setting (for example at age 80 years: without MicDaD: 1.76 with 2.5–97.5% quantiles: [0.64–23.88] vs. with MicDaD: 2.27 with 2.5–97.5% quantiles: [0.58–113.09]) are potentially caused by low mortality at that age and low incidence in general. In addition, differences in bias between younger and older ages (40 and 60–80) were observed that were potentially caused by the age-dependency of MRR and incidence. Further studies could explore this in more detail.
Our analysis showed that misclassification of disease status at death has an impact on the estimation of mortality rates and MRRs for chronic diseases (especially for chronic diseases with higher incidences). MicDaD leads to an underestimation of the MRR of diseased versus non-diseased individuals. This underestimation results in a misinterpreted risk of death with chronic diseases.
Limitations
Our study is the first work to examine the influence of misclassification of disease status at death on the estimation of MRR, but there are limitations.
First, the current analysis only considers age at entry and exit into states in the IDM. The duration in a state remained unconsidered in our analysis and in the estimation of mortality rates and the related MRR. Duration influences mortality and should be considered in future research.
A second limitation is that we neither controlled nor systematically evaluated the amount of missing information on disease status at death. A random age for one-time participation in the interview was determined and compared to age at diagnosis to decide whether information on diagnosis remained unseen. In further analyses, the amount of missing “diagnoses” could be systematically considered and varied to capture susceptibility to bias in the estimate of the MRR, depending on the amount of missing diagnoses. As a possible solution to this problem, simulation of age at study participation could be performed based on different statistical distributions, or on a distribution based on the actual ages at study participation in studies like NHIS. This approach can be used to systematically analyze whether or to what extent age at study participation influences bias caused by MicDaD. Additionally, it can be examined how the amount of missing information about disease status at death influences the magnitude of this misclassification bias. It is possible that higher ages at study participation and a smaller amount of missing information could reduce the bias caused by MicDaD. However, further research is needed.
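The idea of varying the distribution of age at study participation can be sketched as follows. This is a hypothetical illustration, not part of the presented study: the cohort of diagnosis and death ages is made up, and the two participation-age distributions (uniform, and triangular with its mode at the age at death) are assumptions chosen only to contrast earlier with later interviews.

```python
import random

def unseen_fraction(cohort, draw_interview_age, rng):
    """Share of diagnoses made after the single interview (and therefore unseen)."""
    unseen = sum(1 for dx, death in cohort if draw_interview_age(rng, death) < dx)
    return unseen / len(cohort)

# Two hypothetical participation-age distributions on [18, age at death].
def uniform_age(rng, death):
    return rng.uniform(18, death)

def late_age(rng, death):
    # Triangular distribution with its mode at the upper end: later interviews.
    return rng.triangular(18, death, death)

rng = random.Random(7)

# Hypothetical diseased cohort of (age at diagnosis, age at death); values are made up.
cohort = []
for _ in range(5000):
    dx = max(30.0, rng.gauss(60, 8))
    cohort.append((dx, dx + rng.expovariate(1 / 12)))

for name, sampler in [("uniform", uniform_age), ("late", late_age)]:
    print(name, round(unseen_fraction(cohort, sampler, rng), 3))
```

In this sketch, the distribution that favours later interviews leaves a smaller share of diagnoses unseen. Repeating the MRR estimation for each distribution would then show how the amount of missing information relates to the bias.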
Third, we simulated age-dependent mortality rates only. In contrast to Binder et al. (2014) [4], who considered constant transition intensities, the age-dependency of the transition rates in the IDM is a strength of our study because real-life incidence and mortality depend on age. In practice, however, more (risk) factors, such as sex and socioeconomic position, influence mortality and could be considered in future research.
A fourth limitation of our simulation study is population size. We only simulated one population with 100 000 individuals and sampled sub-populations with 5000 individuals. It is possible that extent of bias caused by MicDaD can differ in bigger or smaller populations. As the population size possibly influences the estimation of mortality rates, further simulation studies can be used to investigate the impact of population size on the bias in estimating MRR with MicDaD.
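How sub-population size affects the spread of the estimates could be examined along the following lines. The sketch is an assumption-laden toy example, not the design of the presented study: the population is made up, and a simple death-risk ratio stands in for the MRR, which in the study is based on mortality rates. It draws 200 sub-populations of two different sizes from one population of 100 000 individuals and reports the median with 2.5% and 97.5% quantiles.

```python
import random
import statistics

def risk_ratio(sample):
    """Death-risk ratio, diseased vs. non-diseased; a simple stand-in for the MRR."""
    n = [0, 0]
    deaths = [0, 0]
    for diseased, died in sample:
        n[int(diseased)] += 1
        deaths[int(diseased)] += int(died)
    return (deaths[1] / n[1]) / (deaths[0] / n[0])

def interval_over_subpopulations(population, size, n_draws, rng):
    """Return the 2.5% quantile, the median and the 97.5% quantile of the estimates."""
    estimates = [risk_ratio(rng.sample(population, size)) for _ in range(n_draws)]
    cuts = statistics.quantiles(estimates, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], statistics.median(estimates), cuts[-1]

rng = random.Random(3)

# Hypothetical population of (diseased, died) flags; the probabilities are made up.
population = []
for _ in range(100_000):
    diseased = rng.random() < 0.2
    died = rng.random() < (0.2 if diseased else 0.1)
    population.append((diseased, died))

for size in (500, 5000):
    lo, med, hi = interval_over_subpopulations(population, size, 200, rng)
    print(size, round(lo, 2), round(med, 2), round(hi, 2))
```

In this toy setting the interval between the 2.5% and 97.5% quantiles is wider for the smaller sub-populations. Whether the bias itself changes with population size would require adding the misclassification step to such a comparison.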
We considered only two different scenarios in our analysis. Thus, the analysis was based on only two possible combinations of age-dependent incidence and mortality rates (based on real-world data from type 2 diabetes and lupus erythematosus). Further analyses are needed to obtain a systematic analysis of the susceptibility and the extent of a bias by MicDaD in relation to mortality and incidence. For this reason, more and different scenarios with varying mortality and incidence rates and combinations of these may be needed.
A fifth limitation is that we reduced misclassification of disease status at death to a problem with one timescale only (age of individuals) and an ordinary differential equation.
Because other time scales, such as calendar time, may also influence the extent of MicDaD-induced bias in estimating the MRR, further research that describes transitions in the IDM on these additional time scales is needed. Furthermore, partial differential equations with age and calendar time as time scales (see Brinks et al. (2016) [12]) may be used.
In the current simulation, individuals are interviewed only once in their lifetime. However, study participants may be interviewed more than once. Additional analyses that allow for repeated study participation could be conducted.
In order to achieve a more detailed description of the extent of missing information of disease status at death, we plan to perform further research considering risk factors, other scenarios with varying mortalities and incidences, and different amounts of missing information and population sizes. Additionally, we will add a second timescale (calendar-time) to our research and consider the possibility of repeated study participation and duration in a state in the IDM.
An additional limitation is that we used only two settings (high incidence with later age at onset; low incidence with earlier age at onset) based on the age-dependent incidence (and thus an indirect regulation of the extent of MicDaD) for the simulation. A wider range of incidence settings would allow a more detailed analysis of different possible situations and frequencies of MicDaD and give a more diverse picture of the effects of MicDaD. However, the aim of our work was a first description of this important epidemiological phenomenon, which potentially plays a role in many studies. The effect of MicDaD (direction and magnitude of the effect of this misclassification) in studies with this design was completely unknown until now. Our goal was not a comprehensive simulation study, but to gain a first insight into this complex topic.
Comparative literature
In contrast to our analysis of mortality rates, Binder et al. (2014) [4] performed a study on the extent of bias in estimating the hazards of risk factors. As they did not evaluate bias in the estimation of mortality, the study is less comparable to our evaluation. Binder et al. evaluated four scenarios with different transition rates in the IDM (mortality rates and incidence rate) and three risk factors with varying impact on these transitions, whereas our study did not investigate risk factors. A second difference to our study was that Binder et al. (2014) assumed mortality and incidence rates to be constant over time. The authors found no impact of misclassification of disease status (and therefore no bias caused by misclassification) on the estimation of hazards for risk factors when mortality rates for diseased and non-diseased individuals were the same and constant over time. These results cannot be compared to our results, as our mortality rates for healthy and diseased individuals were neither constant nor identical. In settings with differing mortality rates, the authors found that bias grew with higher fractions of missing disease information. Bias in the estimation of hazards for risk factors depended on the constellation of mortality and incidence rates: for higher mortality of healthy individuals the effect of a risk factor was overestimated, while it was underestimated for high incidence rates. In our analyses we saw a comparable tendency in the estimation of mortality rates, as the MRR was underestimated in the high incidence setting with MicDaD. An important limitation of the study by Binder et al. (2014) was that they considered constant hazards only (transition rates in the IDM), although mortality is known to be age-dependent.
Another study by Binder et al. [4] investigated how unknown disease status leads to over- and underestimation of effect size estimates for risk factors. Moreover, they revealed that nearly half of all prospective cohort studies are at risk of this bias, especially when data analysis is performed using standard methods instead of the methods they describe in their publication [13].
Another concept of bias in observational studies is immortal time bias (ITB), which is possible in epidemiological studies when a treated (exposed) and a non-treated (non-exposed) group are compared. Individuals in the treated group are immortal before study participation [14]. ITB overestimates the treatment effect on death, whereas MicDaD underestimates mortality caused by the chronic condition of interest without consideration of any treatment.
Thus, the difference between MicDaD and ITB is that MicDaD is caused by diseased individuals who are falsely treated as healthy when having the outcome ‘death’, whereas ITB concerns diseased individuals with a known diagnosis who are unable to die. Additionally, death is impossible for an individual during immortal time under ITB, whereas under MicDaD individuals die without a recorded diagnosis. Therefore, results of studies analysing ITB are less comparable to our study.