Exposure variable definitions
The primary exposure of interest is receipt of an aggressive treatment regimen, which has previously been shown to improve treatment outcomes [22-25]. The aggressive treatment regimen is defined as a regimen containing at least five likely effective drugs based on previous treatment history and current drug resistance patterns during the intensive phase of treatment, followed by at least four likely effective drugs during the continuation phase of treatment [24-26]. A binary variable was used to classify each patient as ever or never having been exposed to an aggressive treatment regimen.
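The exposure definition above can be illustrated with a minimal sketch. The record layout (`RegimenRecord`) and the function names are hypothetical and are not part of the study's data dictionary; the sketch also assumes that the number of likely effective drugs in each phase has already been determined from treatment history and resistance testing.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RegimenRecord:
    """Hypothetical per-regimen summary for one patient."""
    n_effective_intensive: int      # likely effective drugs, intensive phase
    n_effective_continuation: int   # likely effective drugs, continuation phase


def is_aggressive(regimen: RegimenRecord) -> bool:
    """At least five likely effective drugs in the intensive phase,
    followed by at least four in the continuation phase."""
    return (regimen.n_effective_intensive >= 5
            and regimen.n_effective_continuation >= 4)


def ever_aggressive(regimens: Iterable[RegimenRecord]) -> int:
    """Binary exposure: 1 if any regimen met the definition, else 0."""
    return int(any(is_aggressive(r) for r in regimens))
```

A patient whose regimens never meet both phase criteria is coded 0 (never exposed) under this sketch.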
Other characteristics included are those previously identified as risk factors for death [13, 21, 27, 28], including age, sex, alcohol abuse or dependence, presence of a comorbidity, prior treatment history, low body mass index (BMI), severe baseline clinical status, extra-pulmonary TB (EPTB), and extensively drug-resistant (XDR-) TB. Alcohol abuse or dependence was determined at baseline or at the time the physician prescribed medication. The presence of a baseline comorbidity (other than HIV) is defined as the presence of any of the following: diabetes mellitus, chronic renal insufficiency, seizure disorder, baseline hepatitis or transaminitis, or psychiatric disease. Prior treatment history is classified as more than two versus two or fewer previous regimens. Low BMI is defined as < 20 kg/m² for men and < 18.5 kg/m² for women. Severe baseline clinical status is defined as respiratory insufficiency, hemoptysis, or sputum acid-fast bacilli smear (+++) at baseline [22]. XDR-TB is defined as resistance to isoniazid, rifampin, any fluoroquinolone, and at least one of three second-line injectable drugs [29, 30].
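As an illustration of how several of the covariate definitions above translate into derived binary variables, the following sketch encodes the BMI, prior treatment, and comorbidity rules. The function names, the string labels for sex, and the comorbidity labels are assumptions made for this example only.

```python
from typing import Iterable

# Labels are illustrative; the conditions themselves are those listed in the text.
BASELINE_COMORBIDITIES = frozenset({
    "diabetes mellitus",
    "chronic renal insufficiency",
    "seizure disorder",
    "baseline hepatitis or transaminitis",
    "psychiatric disease",
})


def low_bmi(bmi_kg_m2: float, sex: str) -> bool:
    """Low BMI: < 20 kg/m^2 for men, < 18.5 kg/m^2 for women."""
    if sex == "male":
        return bmi_kg_m2 < 20.0
    if sex == "female":
        return bmi_kg_m2 < 18.5
    raise ValueError(f"unexpected sex label: {sex!r}")


def more_than_two_prior_regimens(n_previous_regimens: int) -> bool:
    """Prior treatment history: more than two vs. two or fewer regimens."""
    return n_previous_regimens > 2


def has_baseline_comorbidity(conditions: Iterable[str]) -> bool:
    """True if any listed (non-HIV) comorbidity is present at baseline."""
    return any(c in BASELINE_COMORBIDITIES for c in conditions)
```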
Statistical methods
To characterize the population, we describe demographic information, comorbidities, treatment characteristics, and treatment outcomes. Characteristics are summarized as frequencies and percentages for categorical variables and as means and standard deviations (SD), unless noted otherwise, for continuous variables. Selection bias is evaluated by assessing whether patient characteristics and treatment outcomes differ significantly between included and excluded participants, using chi-square tests, Fisher's exact tests, or t-tests.
Our analysis involves a two-step procedure. First, a logistic regression model is fit to predict the probability of survival at the end of the study period. Second, a Cox proportional hazards model is fit, incorporating recoded failure and censoring outcomes based on the vital status predicted in the logistic regression model.
A logistic regression model is used to predict the probability of survival at the end of the study period for each individual \( i \) who experienced a non-death initial treatment outcome. Vital status is modeled as a random variable, taking the value 1 with probability equal to the parameter \( p_i \), which is a function of the initial treatment outcome (\( O_i \)) and patient characteristics (\( X_i \)). The parameter \( p_i \) is estimated for each individual in the cohort.
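One way to write the model described above is given below. The coefficient symbols \( \beta_0 \), \( \boldsymbol{\beta}_O \), and \( \boldsymbol{\beta}_X \) are notation introduced here for illustration, and the main-effects form is an assumption; the fitted model may also contain combinations of outcomes and characteristics.

\[
Y_i \mid O_i, X_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1-p_i} = \beta_0 + \boldsymbol{\beta}_O^{\top} O_i + \boldsymbol{\beta}_X^{\top} X_i
\]

Under this form, the predicted probability of survival for individual \( i \) is \( {\widehat{p}}_i = 1 / \left(1 + \exp\left(-\left({\widehat{\beta}}_0 + {\widehat{\boldsymbol{\beta}}}_O^{\top} O_i + {\widehat{\boldsymbol{\beta}}}_X^{\top} X_i\right)\right)\right) \).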
Potential predictors eligible for the model include all combinations of the initial treatment outcomes and patient characteristics that may be associated with survival. Patient characteristics considered include those that are routinely collected globally, so that the model may be applied to other TB cohorts in the future for external validation. For model derivation and internal validation, we use 10-fold cross-validation. Data are randomly divided into ten sets; the model is built on nine of these sets, and its performance is then measured on the remaining set. This is repeated until each of the ten sets has been used once to test model performance. The model with the best performance is selected as the final model.
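The 10-fold partitioning described above can be sketched as follows. This is a generic illustration using only the Python standard library, not the procedure run in SAS; the function name and the fixed random seed are assumptions.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists so that each record is held out exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of records")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)          # random division into sets
    folds = [indices[i::k] for i in range(k)]     # k roughly equal-sized sets
    for held_out in range(k):
        test = folds[held_out]
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test
```

Each candidate model would be fit on `train` and scored on `test` in every iteration, and the scores would then be summarized across the ten held-out sets.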
The primary means of comparing predictive models is the Bayesian Information Criterion (BIC) [31], for which lower values indicate better fit. We also use the c-statistic to assess model discrimination, that is, the ability of the model to differentiate between individuals who died by the end of the study and those who did not. The larger the c-statistic, the better the model discriminates [32]. To assess model calibration, which describes the agreement between the predicted and observed risks, we compute the Hosmer-Lemeshow statistic [33]. We define good calibration as a Hosmer-Lemeshow statistic p-value greater than the type-one error rate of 0.05, indicating no evidence that the observed and predicted risks significantly differ.
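For readers less familiar with these measures, the sketch below computes the BIC from a maximized log-likelihood and the c-statistic as the proportion of concordant survivor/non-survivor pairs. It uses the standard textbook definitions and is not the SAS output used in the analysis; the function names are assumptions, and the Hosmer-Lemeshow statistic is not covered here.

```python
import math
from typing import Sequence


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion; lower values indicate better fit."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def c_statistic(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Share of (survivor, non-survivor) pairs in which the survivor has the
    higher predicted survival probability; tied pairs count one half."""
    survivors = [p for y, p in zip(y_true, p_hat) if y == 1]
    deaths = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not survivors or not deaths:
        raise ValueError("both outcome classes are required")
    score = 0.0
    for ps in survivors:
        for pd in deaths:
            if ps > pd:
                score += 1.0
            elif ps == pd:
                score += 0.5
    return score / (len(survivors) * len(deaths))
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.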
A receiver operating characteristic (ROC) curve is used to select a probability threshold, through use of Youden's index, that maximizes the discriminative properties of the model, including sensitivity, specificity, positive predictive value, and negative predictive value. Youden's index is the vertical distance from the diagonal chance line of the ROC plot to each point on the curve; the threshold is taken at the point where this distance is greatest, which aims to minimize the false negative and false positive rates [34]. The discriminative properties are defined as follows: sensitivity is the probability of the model predicting survival at the end of the cohort period given the individual truly survived; specificity is the probability of the model predicting death prior to the end of the cohort period given the individual truly died; positive predictive value is the probability of surviving until the end of the cohort period given the model predicts survival; negative predictive value is the probability of dying prior to the end of the cohort period given the model predicts death. The probability threshold identified is used to assign each individual a vital status of alive (\( {\widehat{Y}}_i=1 \)) or dead (\( {\widehat{Y}}_i=0 \)) at the end of the study period (e.g., if the probability threshold is set at 0.85, then \( {\widehat{Y}}_i=1 \) if \( {\widehat{p}}_i > 0.85 \) and \( {\widehat{Y}}_i=0 \) if \( {\widehat{p}}_i < 0.85 \)).
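The threshold selection described above can be sketched as a search over candidate cut points for the one that maximizes Youden's index, J = sensitivity + specificity - 1. The function names are hypothetical, and using midpoints between adjacent predicted probabilities as candidates is an assumption of this sketch, chosen so that no predicted probability equals a candidate threshold.

```python
from typing import List, Sequence, Tuple


def youden_threshold(y_true: Sequence[int], p_hat: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.
    Here y = 1 means alive and p_hat is the predicted survival probability."""
    n_alive = sum(1 for y in y_true if y == 1)
    n_dead = len(y_true) - n_alive
    values = sorted(set(p_hat))
    if n_alive == 0 or n_dead == 0 or len(values) < 2:
        raise ValueError("need both outcome classes and at least two distinct probabilities")
    candidates = [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
    best_threshold, best_j = candidates[0], -1.0
    for t in candidates:
        true_alive = sum(1 for y, p in zip(y_true, p_hat) if y == 1 and p > t)
        true_dead = sum(1 for y, p in zip(y_true, p_hat) if y == 0 and p < t)
        j = true_alive / n_alive + true_dead / n_dead - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


def assign_vital_status(p_hat: Sequence[float], threshold: float) -> List[int]:
    """Predicted vital status: 1 (alive) if p_hat exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in p_hat]
```

When several candidates tie on J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.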
To evaluate the bias introduced when survival information after the initial treatment outcome is lacking, we run two Cox proportional hazards models. Each model is fit under three different approaches, for a total of six scenarios. Models 1 and 2 both assess the association between receipt of an aggressive treatment regimen and death. Model 1 assesses the univariate association, while Model 2 assesses the association while controlling for the covariates described earlier that were previously found to be associated with time to death.
The three approaches we use on each model are as follows:
- Approach 1: The first approach follows the conventional censoring assumption in which the event time for each individual is either the observed time to death or the time to the observed non-death treatment outcome, at which point censoring occurs.
- Approach 2: The second approach uses the predicted vital status at the end of the study period (\( \widehat{Y} \)). All individuals assigned \( {\widehat{Y}}_i=1 \) are assumed to survive at least until the end of the cohort period and contribute full survival time during that period. All individuals assigned \( {\widehat{Y}}_i=0 \) are assumed to be at the same risk of death as the at-risk individuals remaining in the cohort. These observations are censored at the time of the observed non-death treatment outcome.
- Approach 3: The third approach, the gold standard, uses the true vital status at the end of the study (\( Y_i \)). Individual event times are either the time of death or the time to the end of the cohort period, at which point all individuals who remain alive are censored. This approach serves as the reference against which values obtained from Approaches 1 and 2 are compared.
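The three approaches differ only in how each individual's follow-up time and event indicator are coded before the Cox model is fit. The sketch below shows one possible recoding; the `Patient` fields and function names are hypothetical, times are measured from treatment start in arbitrary units, and each function returns a `(time, event)` pair with `event = 1` for death and `0` for censoring.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Patient:
    """Hypothetical analysis record for one individual."""
    time_to_initial_outcome: float     # time to initial treatment outcome
    initial_outcome_is_death: bool     # True if the initial outcome was death
    y_hat: Optional[int]               # predicted status (1 alive, 0 dead); None if initial outcome was death
    true_death_time: Optional[float]   # true time of death within the cohort period, else None
    end_of_period: float               # time at which the cohort period ends


def approach_1(p: Patient) -> Tuple[float, int]:
    """Conventional censoring at the observed non-death treatment outcome."""
    return p.time_to_initial_outcome, int(p.initial_outcome_is_death)


def approach_2(p: Patient) -> Tuple[float, int]:
    """Use predicted vital status: predicted survivors contribute full follow-up."""
    if p.initial_outcome_is_death:
        return p.time_to_initial_outcome, 1
    if p.y_hat == 1:
        return p.end_of_period, 0
    return p.time_to_initial_outcome, 0    # predicted dead: censor at outcome


def approach_3(p: Patient) -> Tuple[float, int]:
    """Reference: use true vital status at the end of the cohort period."""
    if p.true_death_time is not None:
        return p.true_death_time, 1
    return p.end_of_period, 0
```

For example, an individual with a non-death outcome at time 24 who is predicted alive but in fact dies at time 40 in a 60-unit period is coded as censored at 24 under Approach 1, censored at 60 under Approach 2, and as a death at 40 under Approach 3.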
Estimated hazard ratios (HR) and 95% confidence intervals (CI) for the aggressive treatment regimen variable are presented for each model and approach. For each model, the relative change in the HR is calculated by comparing the estimates from Approaches 1 and 2 with the estimate from Approach 3. Relative to Approach 3, HRs closer to the null value of 1.0 underestimate the treatment effect, while HRs further from 1.0 overestimate the treatment effect. The magnitude and direction of the bias from Approaches 1 and 2 are assessed. Relative changes are compared to identify which approach produces the least biased effect estimates in relation to Approach 3.
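The text does not state the formula used for relative change; a common choice, assumed here, is the percentage difference from the reference estimate. The sketch below also classifies the direction of bias by comparing distances from the null on the log scale, which is a further assumption. The function names are hypothetical.

```python
import math


def relative_change_pct(hr_approach: float, hr_reference: float) -> float:
    """Percentage change of an approach's HR relative to the Approach 3 HR."""
    return 100.0 * (hr_approach - hr_reference) / hr_reference


def bias_direction(hr_approach: float, hr_reference: float) -> str:
    """Compare distance from the null (HR = 1.0) on the log scale."""
    d_approach = abs(math.log(hr_approach))
    d_reference = abs(math.log(hr_reference))
    if math.isclose(d_approach, d_reference):
        return "no difference"
    return "underestimate" if d_approach < d_reference else "overestimate"
```

With purely illustrative values, an HR of 0.60 against a reference HR of 0.50 lies closer to 1.0 and would be labelled an underestimate of the treatment effect.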
SAS V9.4 (SAS Institute, Cary, NC) is used for all analyses.
Institutional Review Boards at Harvard School of Public Health (Boston, Massachusetts) and the Siberian State Medical University (Tomsk, Russia) approved the parent study. Secondary analysis was reviewed and declared exempt by the Institutional Review Board at Northeastern University (Boston, Massachusetts).