Model
To create a realistic data generation process, we used published information on the joint distribution of age and sex in the population, the conditional distribution of highest educational attainment by age and sex, the prevalence of mild cognitive impairment (MCI) and of dementia by age, sex, and education, and the distribution of MoCA test scores conditional on age, education, and cognitive status.
For the joint distribution of age and sex, we used data from the official reports on the Italian resident population as of January 1st, 2023 [21]. Our population of interest consisted of all residents aged between 55 and 89, as shown in Figure e1 in the Supplementary Information.
We extracted the proportion of residents who completed different levels of education by age group and sex from annual statistical reports (Annuario Statistico Italiano) published between 1998 and 2021 [22]. We converted the education levels into years of education by equating primary school (or less) with 5 years, middle school with 8, vocational qualification with 11, secondary education with 13, and university education with 17. This conversion is consistent with the number of school years in the Italian education system and with mean values computed for the individuals tabulated in Aiello et al. (2022) [23]. As the proportion of individuals who completed a specific number of years of education has changed over time due to socio-economic developments, policy changes, and modifications to the education system, we modeled the relationship between education and year of birth. For today’s elderly, relevant disparities in access to schooling and in educational attainment existed between men and women when they were adolescents; we therefore fitted local regression (loess) models by year of birth separately for men and women, as shown in Figure e2. We used all available data points across all publication years for the age groups from 30 to 64 years. The resulting models were used to approximate the distribution of years of education by year of birth and sex. Years of birth were then converted into ages, given that our simulation was set in 2023. This procedure provided us with the conditional distribution of highest educational attainment by age and sex.
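The conversion steps in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the dictionary keys and function names are hypothetical, and the loess fit itself is not reproduced.

```python
# Years of education assigned to each completed level, as stated in the text.
# The key names are hypothetical labels chosen for this sketch.
YEARS_BY_LEVEL = {
    "primary_or_less": 5,
    "middle_school": 8,
    "vocational": 11,
    "secondary": 13,
    "university": 17,
}

SIMULATION_YEAR = 2023  # the simulation is set in 2023


def age_in_simulation(year_of_birth):
    """Age reached during the simulation year (the exact birthday is ignored)."""
    return SIMULATION_YEAR - year_of_birth


def mean_years_of_education(level_shares):
    """Expected years of education given the share of each level (shares sum to 1)."""
    return sum(YEARS_BY_LEVEL[level] * share for level, share in level_shares.items())
```

For example, `age_in_simulation(1950)` returns 73, so a distribution of education fitted for the 1950 birth cohort would be assigned to 73-year-olds.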
To model the prevalence of dementia and MCI in the population of interest by age, education, and sex, we used data from an older, population-based, cross-sectional study from Northern Italy [24], which evaluated the prevalence of dementia and of mild cognitive impairment in all persons above 60 years of age residing in two municipalities in Ravenna province. Here, we equated the category “6 or more years” of education with two completed levels of education (middle school), “4–5 years” with one completed level of education (primary school), and “1–3 years” as well as “no schooling” with zero levels. We fitted two separate Poisson regressions for the number of cases of dementia or MCI, with age (lower bound of the age interval plus 2), sex, and the number of completed levels of education as the independent variables, using the logarithm of the population size as the offset. The resulting prevalence ratios for dementia were 1.14 for each additional year of age (and thus 1.95 for each additional 5 years), 0.49 for each additional level of education, and 1.07 for female sex, while the ratios for MCI were 1.03 for each additional year of age (and thus 1.16 for each additional 5 years), 0.45 for each additional level of education, and 1.14 for female sex (see Table e1). These ratios are consistent with the widely reported doubling of dementia prevalence for every five additional years of age, as also found in a systematic review and meta-analysis [25], and with previously reported protective effects of education. For MCI, the slower increase with age agrees with previous findings [25]. The fitted regressions were then used to extrapolate the prevalences for all combinations of age, sex, and number of completed levels of education. However, the resulting marginal prevalences by age or by age and sex were much lower than recently reported, by approximately a factor of 2 for dementia [26] and by a factor of 6 for MCI [27]. We therefore increased the intercept to scale up the overall prevalence by a constant factor of 2 for dementia and 6 for MCI, which brought the prevalences in line with the literature, as shown in Tables e2 and e3. In doing so, we assumed that the number of diagnoses increased over the past 20+ years without meaningfully altering the underlying prevalence ratios for age, sex, and education.
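The log-linear structure of the Poisson model explains why scaling the intercept leaves the prevalence ratios untouched. The sketch below uses the reported (rounded) dementia prevalence ratios; the intercept value and all function names are hypothetical, since the fitted intercept is not given in this section. Note that the rounded ratio 1.14 raised to the fifth power gives about 1.93, so the reported 1.95 presumably derives from the unrounded coefficient.

```python
import math

# Reported prevalence ratios for dementia (rounded values from the text).
PR_AGE, PR_EDU, PR_FEMALE = 1.14, 0.49, 1.07

# Hypothetical intercept on the log scale; the fitted value is not reported here.
HYPOTHETICAL_INTERCEPT = -14.0


def prevalence(age, edu_levels, female, intercept=HYPOTHETICAL_INTERCEPT):
    """Prevalence implied by a log-linear model: exp(intercept + sum of log-ratio terms)."""
    log_prev = (
        intercept
        + math.log(PR_AGE) * age
        + math.log(PR_EDU) * edu_levels
        + math.log(PR_FEMALE) * female
    )
    return math.exp(log_prev)


def scaled_intercept(intercept, factor):
    """Adding log(factor) to the intercept multiplies every prevalence by `factor`."""
    return intercept + math.log(factor)
```

Because the scaling factor enters only through the intercept, the ratio between any two covariate patterns is the same before and after scaling, which is the assumption stated at the end of the paragraph.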
Finally, we combined the official distribution of the residents’ sex and age, the conditional distribution of education levels by sex and age, as well as the conditional distribution of cognitive status by sex, age, and education level to obtain the joint distribution of sex, age, completed education, and cognitive status. In evaluating the models for the prevalence of MCI and dementia, the categories vocational training (11 years) and secondary education (13 years) from the annual reports were both mapped to three completed levels of education, and university education (17 years) to four completed levels of education. The resulting joint distribution represented our population of interest and is reported in Table 1.
Table 1
Joint distribution of sex, age, completed education, and cognitive status (healthy, mild cognitive impairment (MCI), and dementia (Dem)) for the simulated population of Italian residents aged 55 to 89 years, as of January 1st, 2023. Each cell represents the probability (in %) of sampling an individual with a specific value of sex, age, education, and cognitive status
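Sex | Age | Edu 5 y: healthy | Edu 5 y: MCI | Edu 5 y: Dem | Edu 8 y: healthy | Edu 8 y: MCI | Edu 8 y: Dem | Edu 11 y: healthy | Edu 11 y: MCI | Edu 11 y: Dem | Edu 13 y: healthy | Edu 13 y: MCI | Edu 13 y: Dem | Edu 17 y: healthy | Edu 17 y: MCI | Edu 17 y: Dem |
f | 55–59 | 0.660 | 0.100 | 0.004 | 3.955 | 0.244 | 0.012 | 0.929 | 0.025 | 0.001 | 3.491 | 0.093 | 0.005 | 1.412 | 0.017 | 0.001 |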
f | 60–64 | 0.972 | 0.175 | 0.013 | 3.329 | 0.240 | 0.020 | 0.822 | 0.025 | 0.002 | 2.790 | 0.086 | 0.008 | 1.105 | 0.015 | 0.001 |
f | 65–69 | 1.537 | 0.334 | 0.043 | 2.581 | 0.219 | 0.030 | 0.620 | 0.022 | 0.003 | 1.993 | 0.072 | 0.011 | 0.937 | 0.015 | 0.002 |
f | 70–74 | 2.266 | 0.608 | 0.131 | 1.995 | 0.202 | 0.047 | 0.446 | 0.019 | 0.005 | 1.347 | 0.057 | 0.014 | 0.734 | 0.014 | 0.004 |
f | 75–79 | 2.407 | 0.817 | 0.289 | 1.386 | 0.167 | 0.064 | 0.278 | 0.014 | 0.006 | 0.834 | 0.041 | 0.017 | 0.447 | 0.010 | 0.004 |
f | 80–84 | 2.133 | 1.015 | 0.609 | 0.916 | 0.138 | 0.090 | 0.153 | 0.009 | 0.006 | 0.508 | 0.030 | 0.022 | 0.229 | 0.006 | 0.004 |
f | 85–89 | 1.115 | 0.911 | 0.903 | 0.450 | 0.088 | 0.094 | 0.055 | 0.004 | 0.005 | 0.259 | 0.019 | 0.022 | 0.066 | 0.002 | 0.002 |
m | 55–59 | 0.567 | 0.073 | 0.003 | 4.396 | 0.236 | 0.012 | 0.755 | 0.018 | 0.001 | 3.197 | 0.075 | 0.004 | 1.224 | 0.013 | 0.001 |
m | 60–64 | 0.654 | 0.101 | 0.008 | 3.557 | 0.223 | 0.019 | 0.609 | 0.016 | 0.002 | 2.695 | 0.073 | 0.007 | 1.040 | 0.012 | 0.001 |
m | 65–69 | 0.940 | 0.175 | 0.024 | 2.682 | 0.198 | 0.029 | 0.474 | 0.015 | 0.002 | 2.135 | 0.067 | 0.011 | 0.922 | 0.013 | 0.002 |
m | 70–74 | 1.392 | 0.317 | 0.073 | 2.085 | 0.183 | 0.045 | 0.371 | 0.014 | 0.004 | 1.594 | 0.059 | 0.016 | 0.773 | 0.012 | 0.004 |
m | 75–79 | 1.500 | 0.426 | 0.160 | 1.420 | 0.148 | 0.060 | 0.235 | 0.010 | 0.005 | 1.019 | 0.044 | 0.020 | 0.523 | 0.010 | 0.005 |
m | 80–84 | 1.311 | 0.509 | 0.325 | 0.876 | 0.113 | 0.078 | 0.122 | 0.006 | 0.005 | 0.563 | 0.029 | 0.022 | 0.304 | 0.007 | 0.006 |
m | 85–89 | 0.665 | 0.411 | 0.431 | 0.380 | 0.062 | 0.071 | 0.036 | 0.002 | 0.003 | 0.205 | 0.013 | 0.016 | 0.114 | 0.003 | 0.004 |
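The construction of the joint distribution in Table 1 follows the chain rule of probability: P(sex, age) × P(education | sex, age) × P(status | sex, age, education). The following sketch illustrates that combination under stated assumptions. The mapping of 11, 13, and 17 years to completed levels is taken from the text; the mapping of 5 and 8 years to one and two levels is inferred from the equating described for the prevalence study. All function and variable names are hypothetical.

```python
# Years of education (annual reports) mapped to completed levels (prevalence models).
# 11, 13, 17 follow the text; 5 -> 1 and 8 -> 2 are inferred from the stated equating.
LEVELS_BY_YEARS = {5: 1, 8: 2, 11: 3, 13: 3, 17: 4}


def joint_distribution(p_sex_age, p_edu, p_status):
    """Combine marginal and conditional distributions with the chain rule.

    p_sex_age: {(sex, age_group): probability}
    p_edu:     {(sex, age_group): {education_years: probability}}
    p_status:  function (sex, age_group, levels) -> {status: probability}
    """
    joint = {}
    for (sex, age_group), p_sa in p_sex_age.items():
        for years, p_e in p_edu[(sex, age_group)].items():
            levels = LEVELS_BY_YEARS[years]
            for status, p_s in p_status(sex, age_group, levels).items():
                joint[(sex, age_group, years, status)] = p_sa * p_e * p_s
    return joint
```

If each input distribution sums to 1, the resulting cell probabilities also sum to 1, which matches the interpretation of each table cell as a sampling probability.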
From normative studies, we identified four published models for predicting the average raw MoCA test scores for Italians without cognitive impairment [17–20]:
$${\widehat{MoCA}}_{Aiello}=24.17-0.000008\cdot \left({age}^{3}-297697.18\right)+3.331407\cdot \left(\ln\left(edu\right)-2.325648\right) \tag{1}$$
$${\widehat{MoCA}}_{Conti}=23.28-0.175\cdot \left(age-70.08\right)-24.3\cdot \left(1/edu-0.126\right) \tag{2}$$
$${\widehat{MoCA}}_{Santangelo}=21.98+4.228\cdot \left({\log}_{10}\left(100-age\right)-1.58\right)+3.201\cdot \left(\sqrt{edu}-3.25\right) \tag{3}$$
$${\widehat{MoCA}}_{Montemurro}=25.468-0.089\cdot \left(age-67.086\right)+0.187\cdot \left(edu-11.245\right) \tag{4}$$
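For readers who want to evaluate the four normative models numerically, they can be transcribed directly into Python. The coefficients are those of Eqs. (1) to (4); the function names are hypothetical, and `age` and `edu` are assumed to be given in years.

```python
import math


def moca_aiello(age, edu):
    """Eq. (1): cubic term for age, natural logarithm of education."""
    return 24.17 - 0.000008 * (age**3 - 297697.18) + 3.331407 * (math.log(edu) - 2.325648)


def moca_conti(age, edu):
    """Eq. (2): linear term for age, reciprocal of education."""
    return 23.28 - 0.175 * (age - 70.08) - 24.3 * (1 / edu - 0.126)


def moca_santangelo(age, edu):
    """Eq. (3): base-10 logarithm of (100 - age), square root of education."""
    return 21.98 + 4.228 * (math.log10(100 - age) - 1.58) + 3.201 * (math.sqrt(edu) - 3.25)


def moca_montemurro(age, edu):
    """Eq. (4): linear terms for age and education."""
    return 25.468 - 0.089 * (age - 67.086) + 0.187 * (edu - 11.245)
```

Each model is centered, so the leading constant is the predicted score at the centering values of the transformed age and education terms. All four models predict lower scores with increasing age and higher scores with increasing education.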
Since Montemurro et al. provided several models for raw scores as a function of different combinations of sex, age, and education, the parameters in Eq. (4) were computed by fitting a linear regression to the publicly available data from their study [20]. For patients with MCI and with dementia, the average raw MoCA scores were calculated by subtracting 5.1 and 10.7 points, respectively, from the score predicted for a person without cognitive impairment. These values were obtained as the rounded averages of (a) differences of -5.333 ± 0.531 and -12.278 ± 0.592 between mean MoCA scores in a study in Portugal where MCI, dementia, and control subgroups were matched on age and education [28], (b) coefficients of -4.07 ± 0.63 and -9.66 ± 0.84 from the combined regression model including age and education (in addition to sex, years in the US, and primary language) in a study of monolingual Chinese Americans [29], and (c) coefficients of -5.769 ± 0.696 and -10.147 ± 0.688 from the combined regression model including only age, years of education, and clinical diagnosis in a study in Hong Kong [30]. These effects were comparable to mean differences of 5.44 and 8.77 found in a study in Italy, where groups were matched neither on age nor on education [31], as well as to mean differences of 5.20 for probable MCI patients with MMSE ≤ 23.8 compared to matched healthy controls with MMSE > 23.8 and mean differences of 9.45 or 10.55 for dementia patients compared to two different groups of matched healthy controls, in a small study in Italy [32].
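The decrements of 5.1 and 10.7 points can be checked by averaging the three point estimates quoted in the paragraph. The sketch assumes an unweighted mean rounded to one decimal; the variable and function names are hypothetical.

```python
# Point estimates from the three studies cited in the text: (a), (b), (c).
MCI_EFFECTS = [-5.333, -4.07, -5.769]
DEMENTIA_EFFECTS = [-12.278, -9.66, -10.147]


def rounded_mean_decrement(effects):
    """Unweighted mean of the effects, sign-flipped and rounded to one decimal."""
    return round(-sum(effects) / len(effects), 1)
```

The unrounded means are about 5.06 for MCI and about 10.70 for dementia, which round to the values used in the simulation.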
Raw MoCA scores were assigned to each person by using the mean MoCA test score given their age, education, and cognitive status, as obtained from Eq. (1) for the main analysis, plus a normally distributed error representing the influence of unobserved variables. We assumed errors to be independent and identically distributed with a mean of 0 and a standard deviation of 2.9 (the standard deviation of the residuals obtained after fitting the regression model to the original data in Aiello et al. [23]).
Simulation and data analysis
Using the data generation mechanism described above, we simulated a development sample of 5,000 persons and a separate validation sample of 50,000 persons. All individuals were independently drawn from a near-infinite super-population with the joint distribution of sex, age-group, education, and cognitive status shown in Table 1. Age was then assigned as a continuous value drawn uniformly at random within the respective age-group limits, and raw MoCA scores were generated according to Eq. (1).
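The data generation mechanism can be sketched as follows. This is an illustration under stated assumptions, not the original R code: the cell weights in `TOY_CELLS` are hypothetical rather than taken from Table 1, the age group 55–59 is assumed to span continuous ages from 55 up to 60, and all names are made up for this sketch.

```python
import math
import random

RESIDUAL_SD = 2.9  # residual standard deviation stated in the text
STATUS_DECREMENT = {"healthy": 0.0, "MCI": 5.1, "Dem": 10.7}

# Hypothetical excerpt of a joint distribution:
# (sex, (age_low, age_high), education years, status) -> sampling weight.
TOY_CELLS = {
    ("f", (55, 59), 8, "healthy"): 0.80,
    ("f", (55, 59), 8, "MCI"): 0.15,
    ("f", (55, 59), 8, "Dem"): 0.05,
}


def mean_moca_eq1(age, edu):
    """Mean raw score of a cognitively healthy person according to Eq. (1)."""
    return 24.17 - 0.000008 * (age**3 - 297697.18) + 3.331407 * (math.log(edu) - 2.325648)


def simulate_person(cells, rng):
    """Draw one cell, then a continuous age and a noisy raw MoCA score."""
    keys = list(cells)
    weights = [cells[k] for k in keys]
    sex, (low, high), edu, status = rng.choices(keys, weights=weights, k=1)[0]
    age = rng.uniform(low, high + 1)  # assumption: group 55-59 covers ages 55 up to 60
    raw = mean_moca_eq1(age, edu) - STATUS_DECREMENT[status] + rng.gauss(0.0, RESIDUAL_SD)
    return {"sex": sex, "age": age, "edu": edu, "status": status, "raw": raw}


def simulate_sample(cells, n, seed):
    """Draw n independent persons with a reproducible random stream."""
    rng = random.Random(seed)
    return [simulate_person(cells, rng) for _ in range(n)]
```

A development sample would correspond to `simulate_sample(cells, 5000, seed)` and a validation sample to a separate call with `n=50000`, with `cells` holding the full joint distribution.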
We then fitted a regression with the raw MoCA test scores as the dependent variable and age and education as the independent variables, using only the “healthy” individuals (without cognitive impairment) from the development sample and the same terms for age and education as in Eq. (1). Using the resulting predictions (\({\widehat{MoCA}}_{prediction}\)) and the intercept of the model (\({\beta }_{0}\)), we computed corrected scores for all individuals according to Eq. (5), without rounding or clipping. This approach is traditionally used to correct Italian neuropsychological tests [33, 34] and is commonly employed to correct MoCA scores for age and education [17–19]. The corrected score for an individual A is the difference between (i) the observed raw score for individual A and (ii) the expected raw score for a healthy individual of the same age and education as individual A, plus (iii) a constant to ensure that the mean score for the population of healthy individuals remains unchanged.
$${Corrected}_{A}={Raw}_{A}- {\widehat{MoCA}}_{prediction}\left(age={age}_{A}, education={education}_{A}\right)+{\beta }_{0} \tag{5}$$
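A minimal sketch of the correction in Eq. (5) is given below. It assumes ordinary least squares with the centered terms of Eq. (1) (age cubed and the natural logarithm of education), solved through the normal equations with a small hand-written solver so that the example needs only the standard library. The original analysis was done in R; all names here are hypothetical.

```python
import math

AGE3_CENTER = 297697.18     # centering constant for age cubed, from Eq. (1)
LOG_EDU_CENTER = 2.325648   # centering constant for ln(education), from Eq. (1)


def design_row(age, edu):
    """Intercept plus the centered age and education terms of Eq. (1)."""
    return [1.0, age**3 - AGE3_CENTER, math.log(edu) - LOG_EDU_CENTER]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def corrected_score(raw, age, edu, beta):
    """Eq. (5): raw score minus the prediction for a healthy peer, plus the intercept."""
    prediction = sum(b * x for b, x in zip(beta, design_row(age, edu)))
    return raw - prediction + beta[0]
```

In use, `fit_ols` would be called on the healthy individuals of the development sample only, and `corrected_score` would then be applied to every individual, healthy or not, with the same `beta`.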
This approach results in the same AUC, sensitivity, and specificity that would be obtained using the common Z-score correction [5], as these metrics are invariant to additive shifts and the standard deviation of the residuals is constant (so that dividing by it would not alter ranks) [15].
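The invariance argument can be illustrated with a rank-based AUC, which depends only on the ordering of scores. The sketch below is an assumption-laden illustration: it treats lower scores as indicating impairment, computes the AUC by pairwise comparison rather than with pROC, and uses hypothetical names and toy values.

```python
def auc_lower_is_impaired(scores_impaired, scores_healthy):
    """Probability that an impaired person scores lower than a healthy one (ties count 0.5)."""
    wins = 0.0
    for s_i in scores_impaired:
        for s_h in scores_healthy:
            if s_i < s_h:
                wins += 1.0
            elif s_i == s_h:
                wins += 0.5
    return wins / (len(scores_impaired) * len(scores_healthy))


def z_score(corrected, beta0, residual_sd):
    """Z-score equivalent of a corrected score: shift by the intercept, divide by a constant SD."""
    return (corrected - beta0) / residual_sd


# Toy corrected scores (hypothetical values).
impaired = [18.0, 21.5, 25.0]
healthy = [20.0, 24.0, 27.5, 29.0]

auc_corrected = auc_lower_is_impaired(impaired, healthy)
auc_z = auc_lower_is_impaired(
    [z_score(s, 24.17, 2.9) for s in impaired],
    [z_score(s, 24.17, 2.9) for s in healthy],
)
```

Both `auc_corrected` and `auc_z` equal 0.75 here, because subtracting a constant and dividing by a positive constant preserves the order of all scores.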
We then evaluated the overall discrimination performance in the validation sample, estimating the AUC, for both raw scores and for corrected scores, and the AUC difference. AUC values were computed to measure the discrimination performance for distinguishing individuals with cognitive impairment (MCI or dementia) from those without, and separately for distinguishing individuals with MCI from those without cognitive impairment (excluding patients with dementia). As the latter task reflects most closely the scenario of screening previously undiagnosed persons for cognitive impairment, sensitivity and specificity were estimated for this contrast only.
To compute sensitivity and specificity, it is necessary to determine cutoffs a priori for corrected and for raw scores. We determined these cutoffs marginally, i.e., based on their performance on the overall population, in the development sample. This choice was motivated by the fact that raw scores and corrected scores are expected to lead to identical sensitivities and specificities if individual cutoffs were instead determined for each age-education group (which would implicitly correct for age and education).
Many options exist to select cutoffs for cognitive screening tests. It is common practice to choose the cutoff for corrected scores by considering a sample of “healthy” individuals with the same demographic characteristics and selecting the value corresponding to the mean score minus 1 or 2 standard deviations. Under the assumption of normality and constant variance of the residuals, this technique is equivalent to choosing the cutoffs corresponding to a specificity of 84.1% and 97.7%, respectively. Accordingly, we determined the cutoffs in the development sample as the values that ensured a marginal preselected specificity of either 97.7% or 84.1%. In view of the importance of sensitivity in a screening setting, as also highlighted in [35], a third cutoff was determined from the scores of the MCI patients in the development sample, preselecting a sensitivity of 84.1%. Sensitivity and specificity were calculated in the validation sample, both marginally and by age-education group, separately for raw scores and for corrected scores.
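The link between the "mean minus 1 or 2 standard deviations" rule and the specificities of 84.1% and 97.7% follows from the standard normal distribution function, and the marginal cutoffs can be read off as empirical quantiles. The sketch below assumes that a person screens positive when the score is strictly below the cutoff and uses a simple floor-index quantile; the exact quantile rule used in the study is not stated, and all names are hypothetical.

```python
from statistics import NormalDist


def specificity_for_sd_rule(n_sd):
    """Share of healthy scores above (mean - n_sd * SD) under normality."""
    return NormalDist().cdf(n_sd)


def empirical_quantile(values, q):
    """Order statistic at floor(q * n), a simple quantile rule assumed for this sketch."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def cutoff_for_specificity(healthy_scores, target_specificity):
    """Cutoff such that at most (1 - target) of healthy persons fall strictly below it."""
    return empirical_quantile(healthy_scores, 1.0 - target_specificity)


def cutoff_for_sensitivity(mci_scores, target_sensitivity):
    """Cutoff near the target quantile of the MCI scores (approximate for small samples)."""
    return empirical_quantile(mci_scores, target_sensitivity)


def specificity(healthy_scores, cutoff):
    return sum(s >= cutoff for s in healthy_scores) / len(healthy_scores)


def sensitivity(impaired_scores, cutoff):
    return sum(s < cutoff for s in impaired_scores) / len(impaired_scores)
```

In the setting of the text, the cutoff functions would be applied to the development sample, and `specificity` and `sensitivity` would then be evaluated on the validation sample.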
The above simulation procedure was repeated 10,000 times, and means were computed together with 2.5th and 97.5th percentiles for all metrics. In the main analysis, MoCA scores were generated and corrected according to Eq. (1) and neither rounded nor clipped. To assess the robustness of the results, we conducted sensitivity analyses, in which MoCA scores were computed using each of Eqs. (1) to (4), rounded to the nearest integer, and truncated to the valid range of test scores, i.e., [0,30] points. Furthermore, we repeated the main analysis using a larger standard deviation of 3.4 for the residuals.
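The rounding and truncation used in the sensitivity analyses, and the summary across repetitions, can be sketched as follows. The names are hypothetical, and the percentile interpolation rule (`method="inclusive"`) is an assumption, since the text does not specify one. Python's built-in `round` resolves exact halves to the nearest even integer.

```python
import statistics


def round_and_truncate(score):
    """Round to the nearest integer, then restrict to the valid MoCA range [0, 30]."""
    return min(30, max(0, round(score)))


def summarize(values):
    """Mean with 2.5th and 97.5th percentiles of a metric across repetitions."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%; the first and last are needed.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return statistics.mean(values), cuts[0], cuts[-1]
```

For instance, `summarize` would be applied to the 10,000 AUC differences obtained from the repeated simulations to report their mean and a 95% percentile interval.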
All simulations and evaluations were performed using RStudio version 2024.04.1 Build 748 [36], R version 4.4.0 [37], and R package pROC version 1.18.4 [38].