Background
Administrative databases are often used to create models to predict clinical outcomes, in particular survival [1]-[4]. Most cancers are fast growing, and once diagnosed have an enormous impact on survival. Therefore, commonly, models to predict survival among these subjects include detailed oncologic information [5]-[7]. However, earlier cancer diagnoses and advances in treatment have been associated with reduced cancer mortality, such that in 2003 there were an estimated 10 million cancer survivors in the United States [8]. Consequently, patients are living longer after a diagnosis of cancer to the point where existing comorbidities may have a substantial impact on their overall survival.
Prostate cancer is the most common form of non-skin cancer diagnosed in men, with three quarters of cases occurring in men aged 65 years and older [9],[10]. Prostate cancer is slow growing. Accordingly, death among prostate cancer patients is more likely to be associated with a subject’s comorbidities than with prostate cancer itself [11]-[13]. This is particularly true among patients with diabetes [14],[15].
Capturing information from pathology data is labor intensive and expensive. Therefore, if adding these pathology variables to a predictive model built solely from administrative data does not enhance model performance, their inclusion should be avoided. The objective of this study was to quantify the impact of adding Gleason grade and cancer volume (obtained from chart review) to a predictive model for mortality among a cohort of elderly men with incident diabetes and prostate cancer. We further aimed to distinguish all-cause mortality from prostate-cancer-specific mortality. We hypothesized that pathology data may have a great impact on disease-specific mortality, but a smaller or even a null effect on all-cause mortality.
Results and discussion
Overall, 4856 incident diabetic men older than 66 who subsequently developed prostate cancer were identified. Pathology reports were available for 4001 (82%) who were included in the analysis. The median (IQR) age at PC diagnosis was 75 (72-79) years (Table 1). During a median (IQR) follow-up of 4.7 (2.7-7.3) years, 1395 (35%) individuals died, with 321 patients dying of PC (8.5%). At time of PC diagnosis, 1007 patients (25.2%) had high grade (Gleason score ≥ 8) and 2245 (56%) had high volume tumors (tumor volume ≥ 30%).
Table 1
Baseline cohort (n = 4001)
Age (years) at index date, median (IQR) | | 75 (72-79) |
Follow-up time (years), mean (SD) before prostate cancer | | 2.9 (1.2-5.2) |
Follow-up time (years), median (IQR) from prostate cancer to end of follow-up | | 4.7 (2.7-7.3) |
Gleason grade at presentation | Low grade | 1574 (39.3%) |
Intermediate | 1420 (35.4%) |
High grade | 1007 (25.2%) |
Primary treatment | Surgery | 317 (7.9%) |
Radiation | 937 (33.2%) |
Watchful waiting | 1740 (43.5%) |
ADT | 1329 (46%) |
Volume of prostate cancer | High (>30%) | 2245 (56%) |
Low (≤30%) | 1753 (44%) |
TUR diagnosis n (%) | | 681 (18%) |
Co morbidity sum of ADGs n (%) | 5 or less | 1212 (30.3%) |
6-9 | 1951 (49%) |
10 or more | 838 (20.7%) |
*SES status n (%) | 1 | 769 (20.3%) |
2 | 838 (22.2%) |
3 | 764 (20.2%) |
4 | 693 (18.3%) |
5 | 717 (19%) |
Urban n (%) | | 545 (85.6%) |
Prostate cancer specific death n (%) | | 321 (8.5%) |
Overall mortality n (%) | | 1395 (35%) |
The multivariable models to predict all-cause mortality are described in Tables 2 and 3 (Table 2: the model consisting of variables derived from administrative data only; Table 3: the model that included Gleason grade and cancer volume in addition to variables derived from administrative data). In both models, age, year of cohort entry, rural residence and comorbidities were independent predictors of all-cause mortality. Higher Gleason grade and cancer volume (Table 3) were also associated with increased all-cause mortality, even after controlling for the effect of other variables. The accuracy of the models (i.e. the c-statistic) in predicting 5-year mortality was 0.70 (95% CI 0.69-0.71) for the administrative-data-only model and 0.74 (95% CI 0.73-0.75) for the extended model (including pathology information). This corresponded to an incremental increase of 0.04 (95% CI 0.03-0.05) in the c-statistic.
Table 2
Administrative data only model to predict all-cause mortality
Age | 1.113 (1.102-1.123) | <0.0001 |
Year of cohort entry | 0.952 (0.937-0.967) | <0.0001 |
Rural | 1.28 (1.12-1.47) | 0.0003 |
Co morbidity ADGs | Low | Ref | |
Intermediate | 1.3 (1.14-1.5) | <0.0001 |
High | 1.64 (1.45-1.87) | <0.0001 |
SES status | 1 | Ref | |
2 | 0.941 (0.81-1.1) | 0.43 |
3 | 0.94 (0.8-1.1) | 0.42 |
4 | 0.78 (0.66-0.92) | 0.004 |
5 | 0.87 (0.75-1.0) | 0.11 |
Table 3
Extended model with pathology to predict all-cause mortality
Age | 1.1 (1.093-1.15) | <0.0001 |
Year of cohort entry | 0.95 (0.93-0.96) | <0.0001 |
Rural | 1.29 (1.13-1.48) | 0.0002 |
Co morbidity ADGs | Low | ref | |
Intermediate | 1.31 (1.15-1.51) | <0.0001 |
High | 1.65 (1.45-1.89) | <0.0001 |
SES status | 1 | ref | |
2 | 0.94 (0.81-1.1) | 0.43 |
3 | 0.99 (0.85-1.17) | 0.93 |
4 | 0.78 (0.66-0.92) | 0.004 |
5 | 0.85 (0.72-0.99) | 0.05 |
Gleason grade | Low | ref | |
Intermediate | 1.16 (1.01-1.3) | 0.04 |
High | 2.3 (1.97-2.64) | <0.0001 |
Volume of prostate cancer | Low (≤30%) | ref | |
High (>30%) | 1.14 (1.01-1.29) | 0.036 |
Using the NRI method, we first used the model based on administrative data only to classify each person’s predicted probability of 5-year mortality as low (less than 10%), intermediate (10-50%) or high (more than 50%) risk. The risk category of predicted probability of 5-year mortality did not change in 85.2% of patients when the extended model was applied (Table 4). Among the 214 (5.3%) subjects with low predicted probability of 5-year mortality, the extended model reclassified 31 (14.5%) subjects to a higher risk group (Table 4). Among the 353 (8.8%) patients with a high predicted probability of 5-year mortality, the extended model moved 122 (34.6%) patients to a lower risk group. Of the 3432 patients classified to 10-50% risk, the extended model moved 221 (6.4%) to the lower risk group and 219 (6.4%) to the higher risk group.
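Table 4
Net reclassification improvement
Administrative-data model risk category | Extended model: less than 10%, n | % | Extended model: 10-50%, n | % | Extended model: more than 50%, n | % | Total, n | % of cohort |
Less than 10% | 183 | 85.5 | 31 | 14.5 | 0 | 0 | 214 | 5.3 |
10-50% | 221 | 6.4 | 2992 | 87.2 | 219 | 6.4 | 3432 | 85.8 |
More than 50% | 0 | 0 | 122 | 34.6 | 231 | 65.4 | 353 | 8.8 |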
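The reclassification proportions quoted for all-cause mortality can be recomputed directly from the cell counts in Table 4. The following is a minimal sketch, not the analysis code used in the study: the function name `reclassification_summary` and the dictionary keys are hypothetical, and the rows are assumed to be the administrative-data model categories and the columns the extended model categories, as the surrounding text implies. Because Table 4 does not separate subjects who died from those who survived, the sketch reports only how many subjects moved between categories, not an event-stratified NRI.

```python
# Cell counts copied from Table 4.
# Rows: administrative-data model; columns: extended model.
# Category order in both directions: <10%, 10-50%, >50%.
table4 = [
    [183, 31, 0],
    [221, 2992, 219],
    [0, 122, 231],
]


def reclassification_summary(table):
    """Count subjects who stayed in, moved up from, or moved down from
    their original risk category in a square cross-tabulation."""
    n = len(table)
    total = sum(sum(row) for row in table)
    unchanged = sum(table[i][i] for i in range(n))
    moved_up = sum(table[i][j] for i in range(n) for j in range(n) if j > i)
    moved_down = sum(table[i][j] for i in range(n) for j in range(n) if j < i)
    return {
        "total": total,
        "unchanged": unchanged,
        "moved_up": moved_up,
        "moved_down": moved_down,
        "changed": moved_up + moved_down,
    }


summary = reclassification_summary(table4)
pct_unchanged = round(100 * summary["unchanged"] / summary["total"], 1)
pct_changed = round(100 * summary["changed"] / summary["total"], 1)
print(summary)
print(pct_unchanged, pct_changed)
```

With the Table 4 counts, the off-diagonal cells sum to 593 of 3999 subjects, which matches the 85.2% unchanged and 14.8% changed figures reported in the text.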
The multivariable models to predict prostate-cancer specific mortality are described in Tables 5 and 6 (Table 5: the model consisting of variables derived from administrative data only; Table 6: the model that included Gleason grade and cancer volume in addition to the variables in the administrative data model). Higher Gleason grade and cancer volume were important predictors of prostate cancer specific mortality. The accuracy of the models (i.e. the c-statistic) in predicting 5-year mortality was 0.76 (95% CI 0.74-0.78) for the administrative-data-only model and 0.85 (95% CI 0.83-0.87) for the extended model (including pathology information). This corresponded to an incremental increase of 0.09 (95% CI 0.07-0.11) in the c-statistic.
Table 5
Administrative data only model to predict prostate cancer specific mortality
Age | 1.104 (1.08-1.13) | <0.0001 |
Year of cohort entry | 0.815 (0.79-0.84) | <0.0001 |
Rural | 1.29 (0.97-1.97) | 0.0747 |
Co morbidity ADGs | Low | Ref | |
Intermediate | 1.26 (0.95-1.67) | 0.106 |
High | 1.38 (1.04-1.82) | 0.021 |
SES status | 1 | Ref | |
2 | 0.86 (0.62-1.2) | 0.38 |
3 | 1.003 (0.72-1.39) | 0.98 |
4 | 0.81 (0.56-1.16) | 0.24 |
5 | 0.93 (0.66-1.3) | 0.68 |
Table 6
Extended model with pathology to predict prostate cancer specific mortality
Age | 1.084 (1.06-1.106) | <0.0001 |
Year of cohort entry | 0.806 (0.78-0.83) | <0.0001 |
Rural | 1.31 (0.99-1.74) | 0.054 |
Co morbidity ADGs | Low | Ref | |
Intermediate | 1.33 (1.004-1.77) | 0.047 |
High | 1.49 (1.1-1.97) | 0.0043 |
SES status | 1 | Ref | |
2 | 0.846 (0.60-1.18) | 0.32 |
3 | 1.2 (0.87-1.68) | 0.26 |
4 | 0.85 (0.59-1.21) | 0.36 |
5 | 0.88 (0.63-1.25) | 0.49 |
Gleason grade | Low | Ref | |
Intermediate | 1.66 (1.14-2.4) | 0.0076 |
High | 5.97 (4.2-8.47) | <0.0001 |
Volume of prostate cancer | Low (≤30%) | Ref | |
High (>30%) | 1.62 (1.23-2.33) | 0.0012 |
Using the NRI method, we first used the model based on administrative data only to classify each person’s predicted probability of 5-year mortality as low (less than 10%), intermediate (10-50%) or high (more than 50%) risk. The risk category of predicted probability of 5-year prostate cancer specific mortality of 928 subjects (28%) in our cohort changed when the extended model was applied (Table 7). Among the 2981 subjects with low predicted probability of 5-year prostate cancer specific mortality (less than 10%), the extended model reclassified 378 (12.7%) to a higher risk group (Table 7). Among the 28 patients with a high predicted probability of 5-year prostate cancer specific mortality (more than 50%), the extended model moved 18 (64.3%) to a lower risk group. Of the 990 patients classified to 10-50% risk, the extended model moved 469 (47.4%) to the lower risk group and 58 (5.9%) to the higher risk group.
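Table 7
Net reclassification improvement
Administrative-data model risk category | Extended model: less than 10%, n | % | Extended model: 10-50%, n | % | Extended model: more than 50%, n | % | Total, n | % of cohort |
Less than 10% | 2603 | 87.3 | 378 | 12.68 | 0 | 0 | 2981 | 74.54 |
10-50% | 469 | 47.37 | 463 | 46.77 | 58 | 5.9 | 990 | 24.76 |
More than 50% | 0 | 0 | 18 | 64.3 | 10 | 35.7 | 28 | 0.7 |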
Using data from 4001 elderly male diabetic patients who subsequently developed prostate cancer, we demonstrated that pathology data obtained by chart abstraction improved the accuracy in predicting all-cause and prostate-cancer specific mortality. The benefit in predicting all-cause mortality was modest, as evidenced by a difference of only 0.04 in the c-statistic and by the fact that the extended model (with Gleason grade and cancer volume) changed the risk category of predicted probability of survival for 14.8% of the men in our cohort. In contrast, pathology data improved the accuracy in predicting prostate cancer specific mortality considerably. The extended model demonstrated a c-statistic of 0.85 (95% CI 0.83-0.87) compared to 0.76 (95% CI 0.74-0.78) when only administrative data were used. Furthermore, the extended model changed the risk category of predicted probability of 5-year survival for 28% of the men in our cohort.
Most predictive models in prostate cancer rely on clinical data such as PSA, Gleason grade, cancer volume and stage [32]-[34]. Detailed clinical information is often missing in large administrative databases, and therefore these models have limited use among policy makers and health service researchers. Deriving missing data from an administrative database can be accomplished in several ways. One could abstract a random sample of medical records and use that abstraction to develop an algorithm to infer the data [35]. However, this rarely applies to pathology. There is no unified clinical pathway that may identify Gleason grade or volume of tumor. For example, two patients with a similar Gleason grade and stage may receive different treatments and, conversely, patients with different pathology may receive similar treatment. Though less efficient, we believe that chart review is the most reliable method for gathering pathology data.
In this study we demonstrated that the value of chart review and detailed pathology data depends on the research question and the accuracy threshold that is acceptable. Changes in NRI risk categories correspond to the change in sensitivity at the higher threshold plus the change in sensitivity at the lower threshold (Michael Pencina, personal communication). For example, if 10% accuracy is needed, then chart abstraction is required regardless of outcome. However, in a study where 20% accuracy is acceptable, we demonstrated that chart review is needed only for the outcome of prostate cancer mortality, and not for all-cause mortality.
In our study, we reviewed over 5000 pathology reports. If one assumes that a trained chart abstractor (who costs approximately $25 an hour) can review a report in 10 minutes, the total time dedicated to chart review was 840 hours. The total cost associated with this endeavor was $21,000. This may not be feasible for a larger cohort.
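The cost estimate above is simple arithmetic and can be reproduced, or rescaled to a cohort of another size, with a short sketch. The function name `chart_review_cost` is hypothetical, and the report count of 5040 is an assumption: the text states only "over 5000" reports, and 5040 is the count that yields exactly 840 hours at 10 minutes per report.

```python
def chart_review_cost(n_reports, minutes_per_report, hourly_rate):
    """Return (hours, cost) for abstracting n_reports pathology reports."""
    hours = n_reports * minutes_per_report / 60
    return hours, hours * hourly_rate


# Assumed report count (5040) chosen to match the 840 hours stated in the text;
# 10 minutes per report and $25 per hour are the figures given in the text.
hours, cost = chart_review_cost(n_reports=5040, minutes_per_report=10, hourly_rate=25.0)
print(hours, cost)
```

Because both time and cost scale linearly with the number of reports, a cohort ten times larger would require roughly ten times the abstraction budget under the same assumptions.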
Two key aspects of prediction model performance are discrimination and calibration [36]. Discrimination refers to the ability of a prediction model to distinguish between patients who experience the outcome and those who do not. A typical measure of discrimination is the c-statistic; the c-statistic provides the probability that, for a randomly selected pair of subjects, the model gives a higher probability to the subject who had the event, or who had the shorter survival time. However, one limitation of the c-statistic is that a strong risk predictor may have limited impact on it [31],[36]. Furthermore, the c-statistic is difficult to interpret clinically. Therefore, in our study we used NRI to assess the difference in calibration between the models. Calibration measures whether predicted probabilities agree with observed proportions. Reclassification can directly compare the clinical impact of two models by determining how many individuals would be reclassified into clinically relevant risk strata. Adding Gleason grade and cancer volume to the prediction model for all-cause mortality moved approximately 15% of subjects; however, adding the same variables moved approximately 30% of subjects in the model predicting prostate-cancer specific mortality. There are many other tests that may be used to measure predictive accuracy, such as calibration plots, decision curves, and integrated discrimination improvement [37]-[40]. Since the objective of our study was to establish the importance of pathological data and not to form the ideal prediction model, we did not utilize all these measures.
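The pairwise definition of the c-statistic given above can be illustrated for the simplified case of a binary outcome at a fixed time horizon. This is a sketch for illustration only, not the method used in this study: it ignores censoring (which survival-time versions of the statistic must handle), counts tied predictions as half-concordant by a common convention, and uses a hypothetical function name `c_statistic` and made-up toy data.

```python
from itertools import product


def c_statistic(predicted, event):
    """Proportion of (event, non-event) pairs in which the subject with the
    event received the higher predicted probability; ties count as 0.5."""
    cases = [p for p, e in zip(predicted, event) if e == 1]
    controls = [p for p, e in zip(predicted, event) if e == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_case, p_control in product(cases, controls):
        if p_case > p_control:
            concordant += 1.0
        elif p_case == p_control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): predicted 5-year risk and observed death (1) or survival (0).
toy_predicted = [0.9, 0.6, 0.4, 0.2]
toy_event = [1, 0, 1, 0]
toy_c = c_statistic(toy_predicted, toy_event)
print(toy_c)
```

In the toy data, three of the four event/non-event pairs are ranked correctly, giving a c-statistic of 0.75. A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking.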
In our NRI analysis we used several cut-offs to assess reclassification. We considered the 5-year probability of mortality as low (less than 10%), intermediate (10-50%) or high (more than 50%) risk. Although risk is a continuum, clinicians ultimately have to make binary choices, such as whether or not to treat a subject. This entails consideration of how high a risk is “high enough” to necessitate action. A risk cut-point depends on the relationship between the harms of an event and the harms of needless treatment. We therefore set our cut-points to be clinically relevant. Although the intermediate risk category range is rather large (10-50% risk), these patients are often grouped together and treatment decisions are made at the extremes. Since we aimed to assess the utility of adding clinical variables to a prediction model, we believe that these categories are sufficient.
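The cut-offs described above can be expressed as a small classification rule, and applying it to the predictions of two models yields a reclassification cross-tabulation of the kind shown in Tables 4 and 7. The sketch below is illustrative only: the names `risk_category` and `cross_tabulate` are hypothetical, the predicted probabilities are made-up values rather than study data, and the handling of the exact boundary values (10% and 50% both treated as intermediate) follows the wording "less than 10%" and "more than 50%".

```python
from collections import Counter


def risk_category(p):
    """Map a predicted 5-year mortality probability to a risk category."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p < 0.10:
        return "low"
    if p <= 0.50:
        return "intermediate"
    return "high"


def cross_tabulate(admin_probs, extended_probs):
    """Count subjects by (administrative-model category, extended-model category)."""
    return Counter(
        (risk_category(a), risk_category(e))
        for a, e in zip(admin_probs, extended_probs)
    )


# Hypothetical predicted probabilities for five subjects under each model.
admin_probs = [0.05, 0.08, 0.30, 0.45, 0.60]
extended_probs = [0.04, 0.15, 0.35, 0.55, 0.40]
table = cross_tabulate(admin_probs, extended_probs)
print(table)
```

Pairs whose two categories differ are the reclassified subjects; in this toy example three of the five subjects change category.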
There are several limitations to our study. First, our population was older, diabetic, and had a worse Gleason grade distribution than the general population of PC patients; thus, generalizability to a contemporary PC cohort is limited. It is possible that among a younger cohort pathology may have a greater impact on both all-cause mortality and prostate cancer-specific mortality. However, nearly three quarters of men with prostate cancer are aged 65 and older at the time of diagnosis, and most of these patients have other comorbidities such as hypertension, ischemic heart disease and diabetes [9],[13],[15]. Restricting our cohort to incident diabetics who subsequently developed prostate cancer served to create a more homogeneous cohort and to minimize the possibility of misclassification of comorbidities, a common problem with administrative data [41],[42]. Second, we did not include treatment (such as radiation or surgery) in our predictive models. Since these variables are post-baseline (i.e. they occur after the diagnosis), they need to be modeled as time-dependent covariates. If they are not modeled appropriately, immortal time bias is introduced, since in order to receive a treatment a patient needs to survive until that time. Since there is no known way of calculating the c-statistic or NRI for Cox proportional hazards models with time-dependent covariates, we decided not to include them in our models. Furthermore, since treatment would have been included in both the administrative data and the extended models, its exclusion should not have changed our results. Finally, we lack data on severity of diabetes, body mass index, and prostate cancer stage, even in the model including pathology. Further studies are needed to address the impact of adding these variables.
Acknowledgements
Refik Saskin, MSc (ICES), provided discussions on statistical analysis and aided in data acquisition. Dr Iliana Lega provided discussion on drug exposure capture as well as study design. Dr Hadas Fisher provided discussion on study design and covariate capture. None of the contributors were paid for their contribution.
Funding sources
This study was supported by a Canadian Cancer Society Research Institute (CCSRI) Prevention Grant #2011-701003. This study was conducted at the Institute for Clinical Evaluative Sciences (ICES), which is funded by an annual grant from the Ontario Ministry of Health and Long-Term Care (MOHLTC). Dr. Bell is supported by a Canadian Institutes of Health Research and Canadian Patient Safety Institute Chair in Patient Safety and Continuity of Care. Dr. Austin was supported in part by a Career Investigator Award from the Heart and Stroke Foundation of Ontario. Dr. Lipscombe is supported by a Canadian Diabetes Association/CIHR-Institute of Nutrition, Metabolism and Diabetes Clinician Scientist Award.
Disclaimer
The opinions, results, and conclusions reported in this article are those of the authors and are independent from the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
DM had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. DM carried out data collection and assembly, drafted the initial version of the manuscript, and participated in data and statistical analysis, interpretation of the data, and critical revision of the manuscript for important intellectual content. DU, LL, CB, GK, JB, and NF participated in analysis and interpretation of the data, critical revision of the manuscript for important intellectual content, and study supervision. PA conceived the study and its design and participated in data and statistical analysis, interpretation of the data, critical revision of the manuscript for important intellectual content, and study supervision. DM, DU, LL, JB, CN, and NF obtained funding. All authors read and approved the final version of this manuscript.