Abstract
A clinical risk classification system is an important component of a treatment decision algorithm. A measure used to assess the strength of a risk classification system is discrimination, and when the outcome is survival time, the most commonly applied global measure of discrimination is the concordance probability. The concordance probability represents the pairwise probability of lower patient risk given longer survival time. The c-index and the concordance probability estimate have been used to estimate the concordance probability when patient-specific risk scores are continuous. In the current paper, the concordance probability estimate and an inverse probability censoring weighted c-index are modified to account for discrete risk scores. Simulations are generated to assess the finite sample properties of the concordance probability estimate and the weighted c-index. An application of these measures of discriminatory power to a metastatic prostate cancer risk classification system is examined.
References
Andersen PK, Gill RD (1982) Cox’s regression model for counting processes: a large sample study. Ann Stat 10:1100–1120
Cheng SC, Wei LJ, Ying Z (1995) Analysis of transformation models with censored data. Biometrika 82:835–845
Danila DC, Fleisher M, Scher HI (2011) Circulating tumor cells as biomarkers in prostate cancer. Clin Cancer Res 17:3903–3912
Danila DC, Anand A, Schultz N, Heller G, Wan M, Sung CC, Dai C, Khanin R, Fleisher M, Lilja H, Scher HI (2014) Analytic and clinical validation of a prostate cancer-enhanced messenger RNA detection assay in whole blood as a prognostic biomarker for survival. Eur Urol 65:1191–1197
Fine JP, Ying Z, Wei LJ (1998) On the linear transformation model for censored data. Biometrika 85:980–986
Gerds TA (2014) Prediction error curves for risk prediction models in survival analysis. R package version 2.4.2
Gerds TA, Kattan MW, Schumacher M, Yu C (2013) Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat Med 32:2173–2184
Gönen M, Heller G (2005) Concordance probability and discriminatory power in proportional hazards regression. Biometrika 92:965–970
Grambsch PM, Therneau TM (1994) Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81:515–526
Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA (1982) Evaluating the yield of medical tests. J Am Med Assoc 247:2543–2546
Kalbfleisch JD (1978) Likelihood methods and nonparametric tests. J Am Stat Assoc 73:167–170
Lin DY, Wei LJ, Ying Z (1993) Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika 80:557–572
Mo Q, Gönen M, Heller G (2012) CPE: concordance probability estimates in survival analysis. R package version 1.4.3
Scher HI, de Bono JS, Fleisher M, Pienta KJ, Raghavan D, Heller G (2009) Circulating tumour cells as prognostic markers in progressive, castration-resistant prostate cancer: a reanalysis of IMMC38 trial data. Lancet Oncol 10:233–239
Sprent P (1989) Applied nonparametric statistical methods, 2nd edn. Chapman and Hall, London
Uno H, Cai T, Pencina MJ, D’Agostino RB, Wei LJ (2011) On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30:1105–1117
Yan G, Greene T (2008) Investigating the effects of ties on measures of concordance. Stat Med 27:4190–4206
Appendix
1.1 The asymptotic distribution for the CPE excluding ties
It is assumed that the proportional hazards model \( h(t|\varvec{X}) = h_0(t) \exp (\varvec{\beta }_0^T \varvec{X}) \) is the correct specification for the data and that the standard conditions for the asymptotic normality of \(n^{1/2} (\hat{\varvec{\beta }} - \varvec{\beta }_0)\) apply (Andersen and Gill 1982).
Consistency of the CPE excluding ties: \( \ \ K_{n,E}(\hat{\varvec{\beta }}) \ \mathop {\rightarrow }\limits ^{p} \ {\mathcal {C}}_E\) where
Using the proportional hazards specification provided in Sect. 2 and the law of large numbers,
Note that the concordance probability excluding ties may be written as
Now using the independence between subjects, the following relations hold
Substituting (2) into (3), it follows that
and substitution of (4) into the denominator of (1) proves the consistency of the CPE.
Asymptotic distribution of \( \ n^{1/2} \left[ K_{n,E}\big (\hat{\varvec{\beta }}\big ) - {\mathcal {C}}_E \right] \)
To prove the asymptotic normality of the CPE, it is first shown that if
then \(n^{1/2} K_{n,E}(\hat{\varvec{\beta }})\) is asymptotically equal to
where within the indicator functions the estimate \(\hat{\varvec{\beta }}\) is replaced with the true regression coefficient \(\varvec{\beta }_0\) and \(\varvec{X}_{ij} = \varvec{X}_i-\varvec{X}_j\). Note that in the discrete covariate case there are a finite number of covariate values \(\varvec{X}_{ij}\). As a result, (5) is not a strong assumption.
The asymptotic equality is demonstrated by considering the cases \(\varvec{X}_{ij}=0\) and \(\varvec{X}_{ij} \ne 0\) separately.
If \(\varvec{X}_{ij} = 0\), then clearly \(I(\hat{\varvec{\beta }}^T \varvec{X}_{ij} > 0) = I(\varvec{\beta }_0^T \varvec{X}_{ij} > 0)\).
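If \(\varvec{X}_{ij} \ne 0\), then using the consistency of \(\hat{\varvec{\beta }}\) and assumption (5), for n sufficiently large, \(|\hat{\varvec{\beta }}^T \varvec{X}_{ij}| > \nu > 0\) and \(I(\hat{\varvec{\beta }}^T \varvec{X}_{ij} > 0) = I(\varvec{\beta }_0^T \varvec{X}_{ij} > 0)\) for all \(\varvec{X}_{ij}\). Therefore, for n large, \(K_{n,E}(\hat{\varvec{\beta }})\) coincides with \(\tilde{K}_{n,E}(\hat{\varvec{\beta }})\), the version in which the indicator functions are evaluated at \(\varvec{\beta }_0\).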
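The sign-stability step can be checked numerically. The sketch below is illustrative Python, not part of the original derivation. It assumes that (5) amounts to a positive margin \(\nu\) separating every nonzero \(\varvec{\beta }_0^T \varvec{X}_{ij}\) from zero. The covariate levels, coefficient values, and function names are all hypothetical.

```python
import itertools
import numpy as np

def margin(levels, beta0):
    """Smallest |beta0' (x_i - x_j)| over ordered pairs of distinct covariate levels."""
    diffs = [np.subtract(a, b) for a, b in itertools.permutations(levels, 2)]
    return min(abs(float(np.dot(beta0, d))) for d in diffs)

def indicators_agree(levels, beta0, beta_hat):
    """True if I(beta_hat' x_ij > 0) == I(beta0' x_ij > 0) for every nonzero x_ij."""
    for a, b in itertools.permutations(levels, 2):
        d = np.subtract(a, b)
        if (np.dot(beta_hat, d) > 0) != (np.dot(beta0, d) > 0):
            return False
    return True

# Hypothetical setting: two binary covariates, hence four covariate levels.
levels = [(0, 0), (1, 0), (0, 1), (1, 1)]
beta0 = np.array([0.8, 0.5])
nu = margin(levels, beta0)

# An estimate close to beta0 leaves every indicator unchanged ...
close = indicators_agree(levels, beta0, np.array([0.9, 0.4]))
# ... while an estimate far from beta0 can flip the sign of beta' x_ij.
far = indicators_agree(levels, beta0, np.array([0.4, 0.6]))
```

Because the set of differences \(\varvec{X}_{ij}\) is finite, the margin is a minimum over finitely many positive numbers, which is why a single \(\nu\) works for all pairs once \(\hat{\varvec{\beta }}\) is close enough to \(\varvec{\beta }_0\).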
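A Taylor expansion for the asymptotically equivalent CPE produces
\[ n^{1/2} \left[ \tilde{K}_{n,E}(\hat{\varvec{\beta }}) - {\mathcal {C}}_E \right] = n^{1/2} \left[ \tilde{K}_{n,E}(\varvec{\beta }_0) - {\mathcal {C}}_E \right] + \left[ \frac{\partial \tilde{K}_{n,E}(\varvec{\beta })}{\partial \varvec{\beta }} \right] ^T_{\varvec{\beta } = \varvec{\beta }_0} n^{1/2} (\hat{\varvec{\beta }} - \varvec{\beta }_0) + o_p(1). \]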
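The partial derivative \( \partial \tilde{K}_{n,E}/\partial \varvec{\beta }\) is asymptotically constant. Since \(n^{1/2}(\hat{\varvec{\beta }} - \varvec{\beta }_0)\) has asymptotic mean zero conditional on \(\varvec{X}\), it is asymptotically independent of \(n^{1/2} \left[ \tilde{K}_{n,E}(\varvec{\beta }_0) - {\mathcal {C}}_E \right] \). Therefore the asymptotic variance of \( \tilde{K}_{n,E}(\hat{\varvec{\beta }})\) is
\[ \text{Var} \left[ \tilde{K}_{n,E}(\varvec{\beta }_0) \right] + \left[ \frac{\partial \tilde{K}_{n,E}(\varvec{\beta })}{\partial \varvec{\beta }} \right] ^T_{\varvec{\beta } = \varvec{\beta }_0} \text{Var}(\hat{\varvec{\beta }}) \left[ \frac{\partial \tilde{K}_{n,E}(\varvec{\beta })}{\partial \varvec{\beta }} \right] _{\varvec{\beta } = \varvec{\beta }_0}. \]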
The individual components of the asymptotic variance can be estimated as follows. In each case, the substitution of \(\hat{\varvec{\beta }}\) for \(\varvec{\beta }_0\) provides a consistent estimate.
\(\text{Var}(\hat{\varvec{\beta }})\) is estimated by the inverse of the negative second derivative of the log partial likelihood (the observed information matrix), evaluated at \(\hat{\varvec{\beta }}\).
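To make this component concrete, here is a minimal Python sketch of the observed information for the Cox partial likelihood. It uses the Breslow form, so it assumes no tied event times, and it takes the coefficient value as given rather than fitting the model. The function name `cox_information` and the toy data are hypothetical.

```python
import numpy as np

def cox_information(beta, X, y, delta):
    """Negative second derivative of the Cox log partial likelihood at beta.

    X: (n, p) covariates, y: observed times, delta: 1 = event, 0 = censored.
    Each event contributes the weighted covariance of X over its risk set.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta)
    beta = np.asarray(beta, dtype=float)
    w = np.exp(X @ beta)                      # relative risks exp(beta' x)
    info = np.zeros((X.shape[1], X.shape[1]))
    for i in np.flatnonzero(delta):
        at_risk = y >= y[i]                   # risk set at the i-th event time
        wr, Xr = w[at_risk], X[at_risk]
        s0 = wr.sum()
        mean = (wr[:, None] * Xr).sum(axis=0) / s0
        second = (Xr.T * wr) @ Xr / s0
        info += second - np.outer(mean, mean)
    return info

# Toy data (hypothetical): one binary covariate, four uncensored subjects.
X = np.array([[1.0], [0.0], [1.0], [0.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])
delta = np.array([1, 1, 1, 1])

info = cox_information(np.zeros(1), X, y, delta)
var_beta = np.linalg.inv(info)                # variance estimate at this beta
```

In practice the information would be evaluated at the fitted \(\hat{\varvec{\beta }}\); the zero coefficient here only keeps the arithmetic easy to verify by hand.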
The partial derivative evaluated at \(\varvec{\beta }_0\) is equal to
The first term, \(n^{1/2} \left[ \tilde{K}_{n,E}(\varvec{\beta }_0) - {\mathcal {C}}_E \right] \), may be approximated by the U-statistic
where \(\pi = \lim _{n \rightarrow \infty } n^{-2} \sum _i \sum _j I(\varvec{\beta }_0^T \varvec{X}_{ij} > 0)\).
The asymptotic variance of this U-statistic is
where
Combining these results provides the estimated asymptotic variance of the CPE excluding ties.
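As a numerical companion to this subsection, the Python sketch below computes a plug-in CPE restricted to untied pairs together with the finite-sample analogue of \(\pi \). It assumes the proportional hazards identity \(P(T_i < T_j | \varvec{X}_i, \varvec{X}_j) = [1 + \exp (\varvec{\beta }^T \varvec{X}_{ji})]^{-1}\) and normalizes by the number of ordered pairs with \(\varvec{\beta }^T \varvec{X}_{ij} > 0\). The exact normalization used in the main text is not reproduced in this appendix, so the function should be read as an illustration under those assumptions. All names and data are hypothetical.

```python
import numpy as np

def linear_score_diff(X, beta):
    """Matrix of beta' x_ij = beta' (x_i - x_j) for all ordered pairs (i, j)."""
    score = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    return score[:, None] - score[None, :]

def pi_hat(X, beta):
    """Finite-sample analogue of pi: n^{-2} * sum_i sum_j I(beta' x_ij > 0)."""
    return float((linear_score_diff(X, beta) > 0).mean())

def cpe_excluding_ties(X, beta):
    """Average of [1 + exp(beta' x_ji)]^{-1} over pairs with beta' x_ij > 0."""
    d = linear_score_diff(X, beta)            # d[i, j] = beta' x_ij
    untied = d > 0                            # subject i at higher risk than j
    prob_i_first = 1.0 / (1.0 + np.exp(-d))   # uses beta' x_ji = -beta' x_ij
    return float(prob_i_first[untied].sum() / untied.sum())

# Hypothetical data: one covariate with three risk levels, coefficient 1.
X = np.array([[0.0], [0.0], [1.0], [2.0]])
beta = np.array([1.0])

p = pi_hat(X, beta)
cpe = cpe_excluding_ties(X, beta)
```

Pairs in the same risk group have \(\varvec{\beta }^T \varvec{X}_{ij} = 0\) and drop out of both the numerator and the denominator, which is the sense in which ties are excluded.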
Asymptotic distribution of \( \ n^{1/2} \left[ C_{n,E}\big (\hat{\varvec{\beta }}\big ) - {\mathcal {C}}_E \right] \)
From Sect. 3.2, this expression may be written as
where \(\psi = \lim _{n \rightarrow \infty } n^{-2} \sum _i \sum _j \delta _i I(y_i < y_j) I(\hat{\varvec{\beta }}^T\varvec{X}_i \!\ne \! \hat{\varvec{\beta }}^T\varvec{X}_j) \hat{G}^{-1}(y_i|\hat{\varvec{\beta }}^T\varvec{X}_i)\hat{G}^{-1}(y_i|\hat{\varvec{\beta }}^T\varvec{X}_j)\).
Letting
and Taylor expanding \(\hat{\varvec{\beta }}\) around \(\varvec{\beta }_0\),
The second term may be rewritten using the martingale representation theorem as
where
and \(\Lambda _{\varvec{X}_k}\) is the cumulative hazard of the censoring random variable belonging to group \(\varvec{X}_k\) (Cheng et al. 1995; Fine et al. 1998).
It follows that \(n^{1/2} \left[ C_{n,E}(\hat{\varvec{\beta }}) - {\mathcal {C}}_E \right] \)
Each component is asymptotically normal with mean zero. Analytic estimation of the asymptotic variance, however, requires density estimation of the censoring random variable. A stratified bootstrap resampling approach is therefore used instead to estimate the asymptotic variance.
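The resampling idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the strata are taken to be the discrete risk groups, and the statistic is an unweighted Harrell-type c-index over pairs with distinct scores, so the inverse probability censoring weights of \(C_{n,E}\) are omitted for brevity. The function names and simulated data are hypothetical.

```python
import numpy as np

def c_index_excluding_ties(y, delta, score):
    """Unweighted c-index over usable pairs (event first) with distinct scores."""
    y, delta, score = (np.asarray(a) for a in (y, delta, score))
    usable = ((delta[:, None] == 1)
              & (y[:, None] < y[None, :])
              & (score[:, None] != score[None, :]))
    concordant = usable & (score[:, None] > score[None, :])
    n_usable = usable.sum()
    return float(concordant.sum() / n_usable) if n_usable else float("nan")

def stratified_bootstrap_var(y, delta, group, score, n_boot=200, seed=0):
    """Bootstrap variance, resampling subjects with replacement within each group."""
    y, delta, group, score = (np.asarray(a) for a in (y, delta, group, score))
    rng = np.random.default_rng(seed)
    strata = [np.flatnonzero(group == g) for g in np.unique(group)]
    stats = []
    for _ in range(n_boot):
        idx = np.concatenate(
            [rng.choice(s, size=s.size, replace=True) for s in strata])
        stats.append(c_index_excluding_ties(y[idx], delta[idx], score[idx]))
    return float(np.nanvar(stats, ddof=1))

# Simulated example (hypothetical): three risk groups, exponential survival
# with hazard increasing in group, independent exponential censoring.
rng = np.random.default_rng(1)
group = np.repeat([0, 1, 2], 20)
t = rng.exponential(1.0 / np.exp(0.7 * group))
c = rng.exponential(2.0, size=group.size)
y, delta = np.minimum(t, c), (t <= c).astype(int)

c_hat = c_index_excluding_ties(y, delta, group)
var_hat = stratified_bootstrap_var(y, delta, group, score=group)
```

Resampling within strata keeps the number of subjects per risk group fixed across bootstrap replicates, so every replicate contains the same set of discrete score values.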
Heller, G., Mo, Q. Estimating the concordance probability in a survival analysis with a discrete number of risk groups. Lifetime Data Anal 22, 263–279 (2016). https://doi.org/10.1007/s10985-015-9330-3
DOI: https://doi.org/10.1007/s10985-015-9330-3