Background
Stan [10], uses the geometry of the parameter space to explore this space effectively and rapidly, with stronger guarantees on convergence [11‐14].
Methods
General framework
Joint model with a nonlinear longitudinal biomarker
Individual dynamic predictions
Stan software version 2.8 [10] and its R (version 3.1.3) interface. The a priori distribution of the random effects is assumed to be normal with mean zero and variance-covariance matrix Ω, estimated on the training dataset. The a posteriori distribution of the individual random effects defined by Eq. (4) requires integrating the hazard function of the survival model in Eq. (3). For that purpose, we use a Gauss-Legendre quadrature of order 8. Of note, and unlike what was proposed by Rizopoulos (2011) [4], the uncertainty in θ is neglected (see “Results”).
Model discrimination and calibration
timeROC of R [25].
Application to risk of death in patients with metastatic prostate cancer
Building a reference nonlinear joint model
- No link: \(f\left (t,\ \psi _{i}\right)=0\)
- Current PSA: \(f\left (t,\ \psi _{i}\right)=\log (PSA\left (t,\ \psi _{i}\right)+1)\)
- Current PSA slope: \(f\left (t,\ \psi _{i}\right)=\frac {d\log (PSA\left (t,\ \psi _{i}\right)+1)}{dt}\)
- Area under PSA: \(f\left (t,\ \psi _{i}\right)=\int _{0}^{t} \log (PSA\left (u,\ \psi _{i}\right)+1)du\)
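The four candidate link functions above can be evaluated numerically for any PSA trajectory. The following Python sketch is illustrative only and is not the authors' implementation. `toy_psa` is a hypothetical trajectory, not the PSA kinetic model of the paper, and the central finite difference (slope) and composite Simpson rule (area) are numerical choices made here for simplicity.

```python
import math


def log_psa(psa, t):
    """log(PSA(t) + 1), the transformed biomarker shared by all links."""
    return math.log(psa(t) + 1.0)


def link_none(psa, t):
    """No link: the hazard does not depend on PSA."""
    return 0.0


def link_current(psa, t):
    """Current PSA: log(PSA(t) + 1)."""
    return log_psa(psa, t)


def link_slope(psa, t, h=1e-4):
    """Current PSA slope: d/dt log(PSA(t) + 1), by central difference."""
    return (log_psa(psa, t + h) - log_psa(psa, t - h)) / (2.0 * h)


def link_area(psa, t, n_intervals=200):
    """Area under PSA: integral of log(PSA(u) + 1) over [0, t] (Simpson)."""
    step = t / n_intervals  # n_intervals must be even for Simpson's rule
    total = log_psa(psa, 0.0) + log_psa(psa, t)
    for i in range(1, n_intervals):
        total += (4.0 if i % 2 else 2.0) * log_psa(psa, i * step)
    return total * step / 3.0


def toy_psa(t):
    """Hypothetical trajectory chosen so that log(PSA(t) + 1) = 0.05 * t."""
    return math.exp(0.05 * t) - 1.0


if __name__ == "__main__":
    links = [("no link", link_none), ("current PSA", link_current),
             ("PSA slope", link_slope), ("area under PSA", link_area)]
    for name, link in links:
        print(name, link(toy_psa, 10.0))
```

Because the three non-null links live on very different scales (a level, a rate per day, and a time integral), their coefficients β are not directly comparable across models.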
Models | No link | Current PSA | PSA slope | Area under PSA |
---|---|---|---|---|
BIC | 14350 | 14192 | 14291 | 14327 |
\(r\) (day⁻¹) | 0.054 (1) | 0.054 (1) | 0.055 (1) | 0.054 (1) |
\(PSA_{0}\) (ng.mL⁻¹) | 74.6 (8) | 73.9 (8) | 73.4 (8) | 74.9 (8) |
\(\varepsilon\) | 0.35 (5) | 0.34 (5) | 0.35 (5) | 0.35 (5) |
\(T_{esc}\) (day) | 138 (4) | 138 (4) | 142 (4) | 136 (4) |
\(\lambda\) (day) | 885 (4) | 3800 (9) | 1500 (9) | 1410 (13) |
\(k\) | 1.52 (3) | 1.19 (1) | 1.33 (9) | 1.15 (7) |
\(\beta\) | - | 0.32 (4) | 100 (10) | 0.00025 (20) |
\(\omega _{r}\) | 0.098 (5) | 0.098 (4) | 0.11 (5) | 0.10 (5) |
\(\omega _{PSA_{0}}\) | 1.57 (4) | 1.57 (4) | 1.55 (4) | 1.56 (4) |
\(\omega _{\varepsilon }\) | 1.35 (5) | 1.34 (5) | 1.22 (5) | 1.36 (5) |
\(\omega _{T_{esc}}\) | 0.68 (5) | 0.64 (5) | 0.63 (5) | 0.66 (5) |
\(\sigma\) | 0.38 (1) | 0.38 (1) | 0.38 (1) | 0.38 (1) |
Simulation
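The coverage probability used in this simulation study is the proportion of simulated patients whose true value lies inside the Monte Carlo 95% prediction interval. A minimal sketch of that computation follows, written in Python for illustration rather than taken from the authors' Stan and R code. The function names are hypothetical, and taking the empirical 2.5% and 97.5% quantiles of the Monte Carlo samples as interval bounds is an assumption about how the interval is formed.

```python
from statistics import quantiles


def prediction_interval_95(samples):
    """Empirical 2.5% and 97.5% quantiles of one patient's Monte Carlo samples."""
    # n=40 yields cut points every 2.5%; the first and last bound the interval.
    cuts = quantiles(samples, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def coverage_probability(samples_by_patient, true_values):
    """Share of patients whose true value falls inside their 95% interval."""
    hits = 0
    for samples, truth in zip(samples_by_patient, true_values):
        lower, upper = prediction_interval_95(samples)
        if lower <= truth <= upper:
            hits += 1
    return hits / len(true_values)


if __name__ == "__main__":
    # Hypothetical input: 3 patients, each with L = 200 Monte Carlo samples.
    grid = [i / 199 for i in range(200)]
    samples_by_patient = [grid, grid, grid]
    true_values = [0.50, 0.01, 0.90]
    print(coverage_probability(samples_by_patient, true_values))
```

In the setting described below, this would be evaluated once per landmark time, horizon time and quantity of interest (PSA or hazard rate), with a value close to 0.95 indicating well-calibrated intervals.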
Stan and R (codes available as Additional file 2) as explained previously in the general framework, with L=200 the number of Monte Carlo samples. For each landmark s∈{0,6,12,18} months and each horizon time t>2 months, we calculate the coverage probabilities of the 95% prediction intervals for both PSA and hazard rate, i.e., the proportion of simulated patients for whom the true value of interest (either the simulated PSA value or the simulated risk of death) is contained in the corresponding Monte Carlo 95% prediction interval.
Using timeROC of R [25], we estimate time-dependent AUC, BS and sBS for each landmark time s and horizon time t, using the Kaplan-Meier estimates computed in the \(N_{sim}=200\) simulated patients themselves to calculate \(BS_{KM}(s,t)\).
Real data
Results
Reference nonlinear joint model
Simulated data
Real data
Discussion
Stan to characterize the full a posteriori distribution of the individual random effects. Of note, software for nonlinear mixed-effects models (R, SAS or, more specifically, Monolix or NONMEM in pharmacometrics) can also produce individual “posthoc” parameters, typically the mode (or the mean) of the conditional distribution of the random effects. Yet, in clinical practice, having only the most likely value of the prediction does not account for the uncertainty in the individual parameter estimates. To characterize the prediction interval, one frequent approach is to use asymptotic Gaussian approximations [29]. However, this may not always be accurate, for instance when the data are limited; additionally, it does not take into account the correlations between the parameters. We show by simulation that using HMC implemented in Stan provides good coverage probabilities, except for long follow-up, where the prediction intervals tended to be overoptimistic (s=18 months, see Fig. 1). Whether this is specific to this simulation framework or is a more general pattern will need to be verified. Likewise, a formal comparison between HMC and traditional MCMC methods in the context of individual dynamic prediction using nonlinear joint models could be of interest. In terms of model prediction assessment, the AUC and BS metrics are model-free and thus can be applied to a nonlinear context using existing packages [25]. Here, while the AUC and the BS improve over the landmark time in the simulation study, they tend to stagnate in the real data. This is likely because, in the simulation, the amount of data increases linearly with the landmark time (since we assume measurements every 3 weeks), while in the real data PSA measurements become less frequent over time after the end of treatment.
Stan [30], could be relevant. Further, the biological model, albeit nonlinear, remains very simplistic. For instance, the effect of covariates such as age on the longitudinal process could be investigated. Moreover, the model does not account for the mechanisms leading to treatment resistance and then relapse that we identified previously [26]. Rather, we assume that PSA kinetics and risk of death are not modified after treatment cessation and continue at the same pace as before. Moreover, only PSA kinetics was assumed to drive the complex process leading to death. These simplifications may explain in part why the model is good at identifying patients at higher risk but does less well at predicting the time-to-death.
Conclusion
Stan software for stiff ODEs will make it possible to use more mechanistic joint models that naturally integrate the correlation between several longitudinal processes. Thus, the development of nonlinear models that will accompany the collection of new biomarkers in routine practice [31] may be an important step towards better prediction of the risk of death and may improve the early identification of patients at greater risk.
Acknowledgements
Stan and Hervé Le Nagard and François Cohen for the use of the computer cluster services hosted on the “Centre de biomodélisation UMR1137”.