Background
Risk adjustment is an important component in clinical epidemiology and health services research using administrative databases, but its methods remain controversial. Administrative databases are widely used in studies because of their availability and large sample sizes, and risk-adjusted mortality is employed as one of the outcome measures. However, the validity of risk-adjustment models for administrative data has been questioned repeatedly [1–4]. It has been argued that administrative data lack important clinical information [5–8] and often do not distinguish between conditions present on admission and complications occurring during hospitalization [6–10]. Inadequate risk adjustment can lead to misleading results, such as confounding by indication and low ratings for facilities that care for sicker patients. Thus, appropriate risk-adjustment models are needed.
Previous studies have shown that the performance of risk-adjustment models using administrative databases improves when detailed clinical information is added. In addition to patients’ demographic characteristics, comorbid illnesses recorded in administrative data enabled risk adjustment using measures such as the Charlson comorbidity index (CCI) [11]. Furthermore, models using laboratory data, vital signs, and other clinical findings provided better predictions of mortality [12–15], and models using disease-specific diagnostic tests and treatments have been introduced for some diseases [16–19]. Meanwhile, precise laboratory and clinical data are not available in most administrative databases. Therefore, an alternative method has been reported, in which surgeries and major therapeutic procedures are associated with in-hospital death [20].
In addition to major therapeutic procedures, commonly performed procedures, whether diagnostic or therapeutic, can reflect the severity of patients' conditions on admission. For example, patients who receive oxygen therapy are expected to be in a more severe condition than those who do not. However, there have been no evaluations of risk-adjustment models that use commonly performed procedures. In addition, previous models using laboratory and clinical data were developed and validated in limited regions.
The aims of the present study were to develop an index of severity using procedure records in a nationwide database, and to examine the ability of this index to predict in-hospital death.
Methods
Data source
The Diagnosis Procedure Combination database is a national administrative database of acute-care inpatients in Japan that is linked with a payment system. The mandatory-participating academic hospitals (all 82 hospitals) and voluntary-participating community hospitals provide claims data of all of their acute-care inpatients. In 2012, there were approximately 1,000 participating hospitals with 7 million admissions recorded annually, representing 50 % of all acute-care hospitalizations in Japan.
The database includes the following data: hospital identification code; patient demographics; diagnoses; admission and discharge status; surgeries and procedures performed; drugs used; and special reimbursements for specific conditions. Up to 12 diagnoses for each admission are recorded, and coded using the International Classification of Diseases, Tenth Revision (ICD-10). One diagnosis each is recorded for “main diagnosis,” “admission-precipitating diagnosis,” “most resource-consuming diagnosis,” and “second most resource-consuming diagnosis.” A maximum of four diagnoses each are recorded for “comorbidities present on admission” and “conditions arising after admission.” Suspected diagnoses are allowed to be recorded, in which case they are designated as such. Surgeries, drugs, procedures, and special reimbursements are coded according to the Japanese fee schedule for reimbursement [21], and their dates of use or application are recorded. The daily quantities of each drug administered are also recorded.
Study cohort
We included all adult patients (≥18 years) discharged between 1 April 2012 and 31 March 2013 with a confirmed admission-precipitating diagnosis of acute myocardial infarction, congestive heart failure, acute cerebrovascular disease, gastrointestinal hemorrhage, pneumonia, or septicemia. The identification of these six diseases was based on the Clinical Classifications Software for Mortality Reporting developed by the Healthcare Cost and Utilization Project [22], and the following Clinical Classifications Software categories were used for the six diseases, respectively: 100, 108, 109, 153, 122, and 2. For congestive heart failure, we also included hypertensive heart disease with heart failure (ICD-10 code: I11.0, I13.0, or I13.2). We excluded the following patients based on their information on the day of admission: those who were admitted to an intensive care unit (including a coronary care unit); and those who received cardiopulmonary life support (cardiopulmonary resuscitation, electrical cardioversion, cardiopulmonary bypass, extracorporeal membrane oxygenation, or ventricular assist device). We identified the former using reimbursement information, and the latter using procedure information.
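The inclusion and exclusion rules above can be sketched in code. The following Python fragment is only an illustration: the record layout (the `Admission` class and its field names) is hypothetical and is not the structure of the Diagnosis Procedure Combination database, whereas the category numbers, ICD-10 codes, age threshold, and discharge period are taken from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Category number -> disease, in the order given in the text.
CCS_DISEASES = {
    100: "acute myocardial infarction",
    108: "congestive heart failure",
    109: "acute cerebrovascular disease",
    153: "gastrointestinal hemorrhage",
    122: "pneumonia",
    2: "septicemia",
}
# Hypertensive heart disease with heart failure, also counted as
# congestive heart failure in the text.
EXTRA_CHF_ICD10 = {"I11.0", "I13.0", "I13.2"}


@dataclass
class Admission:
    """Hypothetical record layout, for illustration only."""
    age: int
    discharge_date: date
    ccs_category: Optional[int]   # category of the admission-precipitating diagnosis
    icd10: str                    # ICD-10 code of that diagnosis
    diagnosis_confirmed: bool     # False for a suspected diagnosis
    icu_on_admission_day: bool
    life_support_on_admission_day: bool


def target_disease(adm: Admission) -> Optional[str]:
    """Return the target disease name, or None if the admission is out of scope."""
    if adm.icd10 in EXTRA_CHF_ICD10:
        return CCS_DISEASES[108]
    return CCS_DISEASES.get(adm.ccs_category)


def is_eligible(adm: Admission) -> bool:
    """Apply the age, period, diagnosis, and day-of-admission exclusion rules."""
    in_period = date(2012, 4, 1) <= adm.discharge_date <= date(2013, 3, 31)
    return (
        adm.age >= 18
        and in_period
        and adm.diagnosis_confirmed
        and target_disease(adm) is not None
        and not adm.icu_on_admission_day
        and not adm.life_support_on_admission_day
    )
```

Keeping the disease lookup separate from the eligibility check makes it straightforward to reuse `target_disease` as the diagnosis covariate later in the analysis.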
The data for diagnostic and therapeutic procedures performed on the day of admission, use of catecholamines (epinephrine, norepinephrine, dopamine, and dobutamine) and vasopressin on the day of admission, and use of blood transfusions (red blood cells, platelets, fresh frozen plasma, and albumin) on the day of admission were extracted. A list of the procedures and codes examined in this study is shown in Additional file 1. For the examinations, examples of the tested items are also listed. Patients who underwent at least one procedure categorized under a given code were assigned that specific code. For example, “D007, blood chemistry tests” would be coded for patients who underwent creatinine testing, as well as for patients who underwent sodium, potassium, and chloride testing. Comorbidities were examined using the diagnoses recorded as comorbidities present on admission, and CCI values were calculated using the coding algorithm [23] and weight assignment [24] reported by Quan et al.
We randomly assigned the eligible patients to the derivation cohort or validation cohort. We developed the severity index for inpatients using the derivation cohort, and tested its performance in the validation cohort.
Index development
In the derivation cohort, we first examined the proportion of patients who underwent each procedure (including use of catecholamines and vasopressin) on the day of admission. For each procedure with ≥1 % prevalence, the chi-square test was used to evaluate the association with in-hospital death. The procedures positively associated with in-hospital death (P < 0.1) were retained for further analysis. Procedures with a correlation (phi coefficient >0.6) were managed in the following manner: (i) a group of procedures usually performed simultaneously were combined into a single variable as at least one procedure; and (ii) for a group of procedures performed consecutively, only the procedure usually performed first was retained. Subsequently, a logistic regression model was developed with in-hospital death as the outcome variable. In the model, the admission-precipitating diagnosis, age, sex, and CCI were included as categorical covariates (age categories: <60, 60–69, 70–79, 80–89, ≥90; CCI categories: 0, 1, 2, ≥3) in addition to the procedures.
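The screening step can be illustrated with a short Python sketch. This is not the authors' code (the analyses were run in SPSS); it assumes a Pearson chi-square test without continuity correction on a 2×2 table, and the function names are hypothetical. The same `phi_coefficient` function can be applied to a pair of procedure indicators to flag correlated procedures (phi > 0.6).

```python
import math


def two_by_two(x, y):
    """Count the cells of a 2x2 table from two parallel lists of booleans."""
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 table with cells a, b (row 1) and c, d (row 2)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def chi_square_test(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) and its P value."""
    n = a + b + c + d
    chi2 = n * phi_coefficient(a, b, c, d) ** 2
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def passes_screen(performed, died, min_prevalence=0.01, alpha=0.1):
    """Retain a procedure with >=1 % prevalence and a positive association (P < 0.1)."""
    a, b, c, d = two_by_two(performed, died)
    prevalence = (a + b) / (a + b + c + d)
    _, p_value = chi_square_test(a, b, c, d)
    return prevalence >= min_prevalence and p_value < alpha and a * d > b * c
```

For a table with 30 deaths among 100 patients who underwent a procedure and 10 deaths among 100 who did not, `phi_coefficient(30, 70, 10, 90)` gives 0.25 and the chi-square statistic is 12.5.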
Using the statistically significant (P < 0.05) regression coefficients obtained with the model, we derived an index-calculating formula by the method of Sullivan et al. [25], using CCI = 1 as a reference. Specifically, a point was assigned to each procedure so that it equaled the integer nearest to the quotient of the regression coefficient for the procedure divided by the regression coefficient for CCI = 1. Thus, the points for each procedure were derived to represent the effect on death relative to the CCI. The severity index for each patient could then be calculated as the sum of the points assigned to the procedures performed on the patient.
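The point assignment described above amounts to a division, a rounding, and a sum. The Python sketch below illustrates it under stated assumptions: the procedure names, coefficients, and P values are invented for the example and are not results from the study, and ties at exactly .5 are rounded upward, a choice the text does not specify.

```python
import math


def nearest_int(x):
    """Round to the nearest integer (ties go upward; this tie rule is an assumption)."""
    return math.floor(x + 0.5)


def assign_points(coefs, p_values, reference_coef, alpha=0.05):
    """Points = coefficient / reference coefficient, rounded, for significant procedures."""
    return {
        name: nearest_int(beta / reference_coef)
        for name, beta in coefs.items()
        if p_values[name] < alpha
    }


def severity_index(points, performed):
    """Sum the points of the procedures performed on one patient."""
    return sum(pts for name, pts in points.items() if name in performed)


# Hypothetical values, for illustration only.
coef_cci_1 = 0.20  # regression coefficient for CCI = 1 (the reference)
coefs = {"oxygen_therapy": 0.62, "intubation": 2.24,
         "blood_chemistry": -0.35, "chest_xray": 0.05}
p_values = {"oxygen_therapy": 0.001, "intubation": 0.001,
            "blood_chemistry": 0.01, "chest_xray": 0.40}

points = assign_points(coefs, p_values, coef_cci_1)
index = severity_index(points, {"oxygen_therapy", "blood_chemistry"})
```

With these invented numbers, oxygen therapy receives 3 points, intubation 11, and blood chemistry −2, while the non-significant chest X-ray is dropped; a patient with oxygen therapy and blood chemistry on the day of admission has an index of 1.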
Index validation
The severity index was calculated for patients in the validation cohort. We examined the distribution of its values, and used a logistic regression model with the index as a continuous variable (model 1) to examine its association with in-hospital death. For every value, the expected death rate among patients with the value was compared with the observed death rate.
We then constructed multiple logistic regression models with different independent variables: severity index, diagnosis, age, and sex (model 2); diagnosis, age, sex, and CCI (model 3); and severity index, diagnosis, age, sex, and CCI (model 4). The discriminatory abilities of the different models were assessed using the c-statistic. We used the integrated discrimination improvement (IDI) [26] to evaluate the improvement in model discrimination gained by adding the severity index. The IDI is the difference between two models in the discrimination slope (the mean predicted probability of an event for those with events minus the corresponding mean predicted probability for those without events) and is a measure of the improvement in model performance. In this study, the IDI was calculated for a comparison of model 4 with model 3.
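Both discrimination measures can be computed directly from predicted probabilities and observed outcomes. The following Python sketch is illustrative only (the study used SPSS); the function names are hypothetical, and the pairwise c-statistic is written for clarity rather than speed, so it would be slow on a cohort of this size.

```python
def _split(probs, events):
    """Separate predicted probabilities by outcome (event vs. no event)."""
    with_event = [p for p, e in zip(probs, events) if e]
    without_event = [p for p, e in zip(probs, events) if not e]
    return with_event, without_event


def c_statistic(probs, events):
    """Share of event/non-event pairs ranked correctly; ties count as one half."""
    pos, neg = _split(probs, events)
    concordant = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return concordant / (len(pos) * len(neg))


def discrimination_slope(probs, events):
    """Mean predicted probability among events minus that among non-events."""
    pos, neg = _split(probs, events)
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def idi(probs_new, probs_old, events):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(probs_new, events) - discrimination_slope(probs_old, events)


# Toy data: two deaths followed by two survivors (hypothetical predictions).
events = [1, 1, 0, 0]
old_model = [0.6, 0.4, 0.5, 0.3]
new_model = [0.8, 0.6, 0.3, 0.1]
```

On this toy data the c-statistic rises from 0.75 to 1.0 and the discrimination slope from 0.1 to 0.5, so the IDI is 0.4.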
We evaluated the relative contribution of the severity index to the prediction of death using the ω-statistic [27]. The ω-statistic is the ratio of the variances of the contributions of two groups of variables to the log-odds of the outcome in a logistic regression model. In this study, we used model 4, and compared the relative contribution of the severity index with that of four other variables. In addition, the calibration of model 4 was evaluated using the Hosmer–Lemeshow decile partition.
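A minimal Python sketch of these two checks follows. It assumes that each patient's contribution from a group of variables is the sum of coefficient × value over that group, and that the Hosmer–Lemeshow partition uses equal-sized groups ordered by predicted probability; the function names are hypothetical, and only the test statistic is computed, not its P value.

```python
import statistics


def omega_statistic(contrib_a, contrib_b):
    """Ratio of the variances of two per-patient log-odds contributions."""
    return statistics.pvariance(contrib_a) / statistics.pvariance(contrib_b)


def hosmer_lemeshow(probs, events, groups=10):
    """Hosmer-Lemeshow statistic over equal-sized groups of predicted risk."""
    pairs = sorted(zip(probs, events), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of deaths
        observed = sum(e for _, e in chunk)   # observed number of deaths
        denom = expected * (1 - expected / len(chunk))
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy contributions to the log-odds (hypothetical values):
# index_part = coefficient of the index x index value, per patient;
# other_part = summed contribution of diagnosis, age, sex, and CCI.
index_part = [0.0, 2.0, 4.0, 6.0]
other_part = [0.0, 1.0, 2.0, 3.0]
omega = omega_statistic(index_part, other_part)
```

In the toy example the variances are 5 and 1.25, so ω is 4; a value above 1 means the first group of variables contributes more to the spread of predicted log-odds than the second.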
We conducted further analyses to test the performance of the severity index across various subgroups of patients. Using the severity index derived from all patients in the derivation cohort, model 4 was constructed for the following subgroups of the validation cohort: those who arrived in an ambulance and those who did not; those who were referred by another institution and those with no referral. We also built models for each admission-precipitating diagnosis, with severity index, age, sex, and CCI as independent variables. The model discrimination and calibration were evaluated for each subgroup.
All P values were two-sided. Statistical analyses were performed using IBM SPSS for Windows, version 22.0 (IBM Corp., Armonk, NY, USA). Because of the anonymous nature of the data, the need for informed consent was waived. Study approval was obtained from the Institutional Review Board of The University of Tokyo.
Discussion
Using the Diagnosis Procedure Combination nationwide administrative database of acute-care hospitals, we derived and internally validated a severity index for inpatients that utilizes procedure records to predict in-hospital death. In the patients with the six diseases examined, the index was widely distributed, and the model combining the severity index with age, sex, and CCI predicted in-hospital death well (c-statistic: 0.767).
We used procedures performed on the day of admission as indicators of severity on admission, and extracted 19 commonly performed procedures, diagnostic and therapeutic, that were significantly associated with in-hospital death or survival. The characteristics of the procedures differed widely, from routinely performed procedures (e.g. blood examinations) to those reflecting critically ill conditions (e.g. intratracheal intubation). This difference was represented in the weights given to each procedure, ranging from −3 to 23. The weights represented the strength of association between each procedure and death, relative to an increase in the CCI. The weighted numbers of the performed procedures were then summed into an index with a possible range of −13 to 69.
The mortality-predicting model with only diagnosis, age, sex, and CCI (model 3) had a fair discriminating ability (c-statistic: 0.675), and there was a significant improvement in model performance when the severity index was added (IDI: 0.0700; c-statistic of model 4: 0.767). Furthermore, in model 4, the index contributed to the prediction of death more than all the other variables combined (ω-statistic: 1.09). These results indicate the importance of the severity index for predicting mortality. To our knowledge, this is the first study to examine a mortality prediction model with commonly used procedures and medications, and the results suggest that procedure records are useful for risk adjustment.
Similar to other studies [12, 13], we chose six high-impact medical conditions (acute myocardial infarction, congestive heart failure, acute cerebrovascular disease, gastrointestinal hemorrhage, pneumonia, and septicemia) as the target diseases. Although the single model had a good discriminating ability and was well-calibrated across various subgroups, the c-statistics ranged from 0.70 for septicemia to 0.82 for acute myocardial infarction. A previous study that used demographics, admission diagnosis, comorbidity-based score, and laboratory-based score as variables had similar results, in which the c-statistics were ≥0.80 for 29 admission diagnoses, 0.71–0.80 for 13 admission diagnoses, and <0.70 for two admission diagnoses [14]. Use of the same model for various primary illnesses may result in variable predictive ability across diagnoses because the effects of procedures on mortality may differ across diagnoses. In addition, the main diagnosis and the main therapeutic procedure are themselves predictors of mortality [14, 15, 20]. Therefore, care should be taken when comparing these results with models derived separately for different diagnoses, which often yield higher c-statistics [12, 13, 16–19].
Although comorbidities recorded in administrative databases have provided fairly good predictions of mortality, there have been concerns that diagnoses may reflect complications instead of comorbidities [6–10]. The use of numerical laboratory data is one method suggested by researchers, and high model c-statistics of >0.8 were observed [12–16]. When available, laboratory data provide precise information about patient severity on admission and help to improve the model performance. However, implementation of an administrative database with laboratory data requires considerable cost and effort, and previous studies were thus confined to regional databases. In contrast, our study was conducted using a preexisting nationwide administrative database, and the procedures added considerable predictive ability to a model using demographics and comorbidities. The method presented here could be useful for similar databases with procedure data. For databases without procedure data, we recommend adding such data because it is relatively inexpensive and useful for mortality prediction.
Our study has several strengths. First, it was conducted using a nationwide database, and included adult patients of all ages treated in hospitals with different characteristics in all areas of Japan. Second, chronological information was considered in the diagnoses and procedures. We used the “admission-precipitating diagnosis” for case identification and the “comorbidities present on admission” for comorbidities. Similar to the use of “present on admission” codes in previous US studies [12, 28] and “diagnosis-type indicator” codes in previous Canadian studies [9, 29], our method prevents the misclassification of complications occurring during hospitalization as main diagnoses or comorbidities. Likewise, the information regarding the dates on which procedures were performed enabled the extraction of procedures performed on the day of admission. Third, we used binary indicators of whether each procedure was performed as variables. Although procedure data are not as objective as automatically recorded laboratory data, we believe that the validity of procedure data is higher than that of recorded diagnoses.
This study has several limitations. First, we only examined six medical conditions. It is unknown whether the severity index for inpatients developed in the present study is applicable to patients hospitalized for other conditions. Second, we excluded patients who were critically ill on the day of admission, because we expected that the associations of procedures with mortality would be different in these patients. Some severely ill patients, such as intubated patients and those on catecholamines, were treated in general wards, as sometimes occurs in Japan [30, 31], and were thus included in the analysis. Nevertheless, whether the index is valid for the most critically ill patients, e.g. those admitted to the intensive care unit, requires further examination. Third, we limited the drugs to catecholamines and vasopressin, but other treatments such as intravenous fluids and antibiotics could also represent the severity on admission. Fourth, each admission was considered independent in the analyses. Better mortality prediction may be possible when clustering within patient and within site is taken into account. Also, the variation within each procedure, e.g. the numbers or types of tested items within a blood test, was not accounted for. Last, the study was conducted in Japan using procedure codes in the Japanese fee reimbursement system, and the use of this index in other countries with different routine practices and coding systems will require appropriate conversions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
HY1 designed the study, conducted analyses, interpreted the results, and drafted the manuscript. HM designed the study, interpreted the results, and revised the manuscript. KF collected the data, interpreted the results, and revised the manuscript. HY2 designed the study, collected the data, interpreted the results, and revised the manuscript. All authors have read and approved the final manuscript.