The age and sex dependency of NfL and tau serum levels was investigated using linear regression with a fractional polynomial for age [27]. In all other regression analyses, NfL and tau measurements were included as independent variables and were standardized internally using the collected data, i.e. every effect estimate refers to an increase of one standard deviation (SD). Information on comorbidities (any history of myocardial infarction, heart failure, vascular surgery, pacemaker, cancer, diabetes, joint replacement, or trauma with loss of function, based on medical records) was taken into account in some analyses. Associations between the biomarkers and all-cause mortality were investigated using Cox regression models (adjusted for age, sex, years of education, and presence of any comorbidities). Subgroup analyses by sex were conducted according to Figueiras and colleagues [28].
Missing values in MRI measurements (covariates in the Cox models) were handled by multiple imputation with 100 imputed datasets (R package ‘mice’ [29], version 3.8.0). Under the missing at random assumption, multiple imputation is typically more efficient than complete-case analysis (i.e. it yields smaller standard errors) because it also uses the information in incomplete cases. The imputation model included the event indicator and the Nelson-Aalen estimator of the cumulative hazard rate [30, 31], all variables that appear in the Cox regression models, and, as auxiliary variables, disease history (blood pressure, diabetes, surgeries of the brain, heart, or hip) and general health behaviour (smoking, alcohol consumption). The fraction of incomplete cases among the observed, which is an estimate of the gain in precision when comparing multiple imputation with complete-case analysis [32], was 18% for each MRI variable.
To assess the prognostic performance of the biomarkers, concordance statistics (c statistics) [33] based on the predicted 5-year survival probabilities (estimated with 10-fold cross-validation) were calculated in each imputed dataset. The point estimate of the pooled c statistic was calculated as the average of the 100 individual estimates, and its standard error was computed from both the within- and between-imputation variance of the c statistic. In this way, the c statistic was calculated for different models. The initial model contained only terms for age at baseline, sex, and years of education. Eight further models were constructed by adding either a neuropsychological test score (memory, speed, motor, or CASCADE score), the atrophy score, the total volume of subcortical WML, or a biomarker to the initial model. The c statistic refers to the ability of a model to distinguish an individual with the endpoint (dead) from an individual without (alive). It indicates the probability that, of two individuals, one dying within 5 years and one surviving, the individual who dies will have a higher predicted risk than the survivor [34].
Jack-knife estimation [35] was used to assess the improvement in the c statistic between nested models (R package ‘validstats’ [36], version 1.4). Whenever the c statistic improved, the respective model was expanded by adding another predictor, e.g. by adding NfL to a model containing age, sex, years of education, and the CASCADE score. This stepwise approach was chosen because, when statistical procedures are used to test for incremental prognostic information, a new biomarker should be tested for significance only after all other predictors have already been included in the model [37]. In the example above, the test of interest is whether NfL adds significantly to a model that already includes age, sex, years of education, and the CASCADE score, not whether the CASCADE score is chosen before NfL in a stepwise variable-selection process.
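As an illustration of the c statistic and its pooling across imputed datasets described above, the following minimal Python sketch may help. It is not the authors' R code. It assumes that the 5-year vital status is known for every individual (no censoring) and that the within- and between-imputation variances are combined with Rubin's rules, which the text does not name explicitly. The function names `c_statistic` and `pool_rubin` are hypothetical.

```python
from statistics import mean, variance


def c_statistic(risk, died):
    """Concordance for a binary 5-year outcome (assumption: no censoring).

    Returns the share of (died, survived) pairs in which the individual who
    died had the higher predicted risk; tied predictions count as 0.5.
    """
    cases = [r for r, d in zip(risk, died) if d]
    controls = [r for r, d in zip(risk, died) if not d]
    if not cases or not controls:
        raise ValueError("need at least one death and one survivor")
    score = 0.0
    for r_case in cases:
        for r_control in controls:
            if r_case > r_control:
                score += 1.0
            elif r_case == r_control:
                score += 0.5
    return score / (len(cases) * len(controls))


def pool_rubin(estimates, within_variances):
    """Pool per-imputation estimates (assumption: Rubin's rules).

    estimates: one c statistic per imputed dataset (at least two).
    within_variances: the squared standard error of each estimate.
    Returns (pooled point estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = mean(estimates)              # average over imputed datasets
    within = mean(within_variances)       # within-imputation variance
    between = variance(estimates)         # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

For example, `c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])` compares four (died, survived) pairs, of which three are ordered correctly, and returns 0.75. With censored follow-up, a survival-specific concordance measure would be needed instead of this simplified pairwise count.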
In a sensitivity analysis, the initial model contained comorbidities in addition to age, sex, and years of education. We also repeated these steps using stroke mortality as the outcome to investigate the biomarkers’ prognostic value for death due to neurological disease. Stroke was the only neurological cause of death with a reasonably high number of events; only three deaths from dementia occurred in the MEMO study.
Associations between the biomarkers and the neuropsychological test scores were investigated using linear regression. Associations between the biomarkers and brain structures were investigated using logistic regression (binary outcomes: presence of large WML and presence of lacunar strokes), ordinal logistic regression (size of WML categorized as none, medium, or large), or linear regression (atrophy score). All regression analyses were adjusted for age, sex, years of education, and presence of any comorbidities. Missing values in MRI measurements (outcomes in these regression models) were not imputed because standard errors from multiple imputation are likely to be larger than those from complete-case analysis when only the outcome variable has missing values [38]. In addition to logistic regression, discrimination between participants with and without large WML, and between those with and without lacunar strokes, was measured by the area under the receiver operating characteristic curve (AUC), corrected for over-optimism with 1000 bootstrap repetitions (R package ‘rms’ [39], version 5.1-4). The closer the AUC is to one, the better the discrimination. In addition to linear regression, the effect size was calculated as a measure of how much variation in the outcome could be explained by the biomarker. Effect sizes were based on partial η². Unlike adjusted R², which measures the contribution of the entire model to explaining the variance, partial η² measures the contribution of an individual independent variable. A partial η² of at least 0.01 represents a small effect, at least 0.06 a medium effect, and at least 0.14 a strong effect [40]. All analyses were performed with R [41], version 3.6.1.
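To make the partial η² measure concrete, the short Python sketch below computes it for a single predictor from two residual sums of squares and applies the thresholds quoted above. It is an illustration only, not the authors' R code. It assumes that partial η² is obtained by comparing a linear model without the biomarker (reduced) to the same model with the biomarker (full). The function names and the label "below small" are hypothetical.

```python
def partial_eta_squared(rss_reduced: float, rss_full: float) -> float:
    """Partial eta squared for one predictor added to a linear model.

    rss_reduced: residual sum of squares of the model without the predictor.
    rss_full: residual sum of squares of the model with the predictor.
    Uses SS_effect / (SS_effect + SS_error), where SS_effect is the drop in
    the residual sum of squares and SS_error equals rss_full, so the
    denominator simplifies to rss_reduced.
    """
    if rss_reduced <= 0 or rss_full < 0 or rss_full > rss_reduced:
        raise ValueError("expected 0 <= rss_full <= rss_reduced and rss_reduced > 0")
    ss_effect = rss_reduced - rss_full
    return ss_effect / rss_reduced


def effect_size_label(eta_sq: float) -> str:
    """Classify partial eta squared using the thresholds given in the text."""
    if eta_sq >= 0.14:
        return "strong"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "below small"  # hypothetical label; the text defines no category here
```

For instance, if adding a biomarker lowers the residual sum of squares from 100 to 90, partial η² is 0.10, which falls in the medium range.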