Introduction

Cognitive screening/first-level tests allow an estimate of global efficiency/functioning by adequately balancing between informativity and practicality of usage [1]. Compared to screening tests for dementia [2], those aimed at detecting mild-to-moderate cognitive impairment [3] may be harder for practitioners to interpret because of (a) the magnitude of the target construct (i.e., the deficit) being less obvious and (b) the amount of information provided by the test being limited [4]. Fine-grained, adaptive psychometric approaches can thus help solve interpretation issues to facilitate diagnostic processes by magnifying informativity [5, 6].

The Montreal Cognitive Assessment (MoCA) [7] is one of the most widespread and psychometrically robust screening tools for cognitive impairments of graded severity [8]. The MoCA is a rapid (5–10 min) screening test which evaluates both non-instrumental (executive functioning, attention) and instrumental (language, memory, visuo-spatial abilities, orientation) domains.

In Italy, the MoCA has been adapted and standardized, and both its statistical properties and clinical usability have been thoroughly examined [9,10,11,12].

Psychometric investigations on the MoCA have been carried out both at the sub-test and the single-item levels [13, 14]. A widespread approach that allows a flexible use of cognitive screening tests [15] is to provide norms for their domain-specific sub-tests [10]. Moreover, information regarding single items can further help practitioners interpret test scores by qualitatively assigning different weights to different items [16]. To this last end, Item Response Theory (IRT) analyses [17] have been conducted on MoCA items to assess both their sensitivity and discriminative capability [18,19,20,21]. IRT-based analyses indeed proved to yield relevant insights to performance interpretations; for instance, executive- and memory-related items were often shown to be highly informative [18, 19].

Further improvements to adaptive testing may come from deriving norms that account for inter-regional socio-demographic heterogeneity [22]. Cultural differences within the same country have indeed been highlighted as a relevant confounding predictor when interpreting test scores [23].

Therefore, providing region-/culture-specific, fine-grained psychometric outcomes and normative data can improve first-level cognitive testing in both clinical and research contexts [24].

It is furthermore worth highlighting that rapid socio-demographic changes may pose additional challenges to practitioners when drawing up-to-date clinical inferences since norms need to be frequently renewed [25].

The present study thus aimed at: (i) providing updated, region-specific normative data for the Italian MoCA and its sub-tests; (ii) comparing existing norms for the MoCA in the Italian population to those drawn from a region-specific Italian sample; (iii) providing IRT-based information regarding sensitivity and discriminative capability of MoCA items in an Italian population sample.

Methods

Participants

Five hundred and seventy-nine healthy Italian native speakers were recruited in Lombardy, Northern Italy. Exclusion criteria were: (a) a confirmed diagnosis of neurological or psychiatric disorders; (b) general medical conditions possibly affecting cognition (i.e., non-compensated and/or severe metabolic/internal morbidities and systemic/organ failures); (c) intake of psychotropic drugs. Participants suffering from well-compensated metabolic/internal conditions were included [9, 10]. Participants had normal or corrected-to-normal vision and/or hearing. Sample stratification is reported in Table 1. Data were derived from three different normative studies where the MoCA was administered for cognitive screening aims; the MoCA was administered as the first test in every study, adopting the same procedure (as detailed below), the same sampling criteria (as detailed above) and the same geographical coverage. All of these studies were approved by the Research Evaluation Committee of the Department of Psychology of the University of Milano-Bicocca on behalf of the Ethical Committee of the same Institution. Participants provided informed consent and signed a data treatment disclaimer for research purposes.

Table 1 Sample stratification for age, education and sex

Materials

The Italian version of the MoCA was administered to all participants [26]. Items were grouped as follows: Executive Functioning (EF): Trail-Making B (TMT), phonemic fluency and verbal abstraction tasks; Attention (A): serial backward subtraction, letter detection by tapping and forward/backward digit span tasks; Language (L): confrontation naming and sentence repetition; Visuo-spatial (VS): three-dimension cube copy and Clock Drawing task (CDT); Orientation (O) and Memory (M): spatio-temporal orientation and delayed recall (DR) items, respectively [9, 10].

Statistical analyses

Normality checks on raw variables were performed both descriptively, by evaluating skewness and kurtosis values, and graphically, by visually inspecting histograms and quantile-quantile plots [27, 28]. Between-variables associations were thus tested via either parametric (Pearson’s) or non-parametric (Spearman’s) techniques. Sex differences were tested via independent-sample t tests.
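The descriptive part of such a normality check can be sketched as follows. This is only an illustration under stated assumptions: the moment-based (uncorrected) definitions of skewness and excess kurtosis are used, which may differ slightly from the bias-corrected estimators reported by statistical packages, and the scores are hypothetical values, not data from this study.

```python
def central_moment(values, order):
    """Return the central moment of the given order (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical raw scores with a long left tail (ceiling-like distribution).
scores = [30, 29, 29, 28, 27, 27, 26, 25, 22, 17]
print(skewness(scores), excess_kurtosis(scores))
```

A negative skewness value, as for the hypothetical scores above, would point to a left-tailed distribution and hence towards non-parametric techniques.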

MoCA reliability was assessed via an internal consistency analysis (Cronbach’s α), whereas construct validity by means of a Principal Component Analysis (PCA). Single-item-level analyses were performed by applying a two-parameter logistic IRT model for dichotomous outcomes via the R package ltm [29]; item difficulty and discrimination were thus computed [17, 30, 31]. Higher values of both parameters correspond to higher levels of the target construct. Cognitive efficiency was regarded as the latent trait.
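For readers less familiar with the two-parameter logistic model, the role of the two item parameters can be sketched as follows. This is a minimal illustration assuming the usual parameterization a(θ − b); the function names and the two items are hypothetical, and the snippet does not reproduce the estimation carried out with ltm.

```python
import math


def p_correct(theta, discrimination, difficulty):
    """2PL item characteristic curve: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def item_information(theta, discrimination, difficulty):
    """Fisher information of a 2PL item at ability level theta."""
    p = p_correct(theta, discrimination, difficulty)
    return discrimination ** 2 * p * (1.0 - p)


# Hypothetical items: an easy, weakly discriminating one and a harder,
# strongly discriminating one (values are illustrative only).
easy_item = {"discrimination": 0.6, "difficulty": -2.5}
hard_item = {"discrimination": 2.0, "difficulty": -0.5}

for theta in (-3.0, -1.5, 0.0, 1.5):
    print(theta,
          round(p_correct(theta, **easy_item), 3),
          round(p_correct(theta, **hard_item), 3))
```

Under this parameterization, difficulty is the level of the latent trait at which the probability of a correct response equals 0.5, whereas discrimination governs how steeply that probability changes around that level; item information peaks at θ = difficulty with a value of discrimination²/4.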

Regression-based norms were derived via the Equivalent Scores (ESs) method [32, 33]; outer and inner tolerance limits (oTL and iTL, respectively) as well as ESs threshold were computed. Average ESs (AESs) [34] were also calculated by averaging ESs of each sub-test to provide a standardized across-domain global index.
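The logic behind non-parametric tolerance limits can be sketched as follows. This is an illustration of the general order-statistic approach, not necessarily the exact procedure of [32, 33, 41]: it assumes that limits are located by rank among the sorted adjusted scores, uses a 5% proportion with 95% confidence, and ignores ties; all function names are hypothetical.

```python
import math


def binom_upper_tail(r, n, p):
    """P(X >= r) for X ~ Binomial(n, p), with terms computed in log space."""
    total = 0.0
    for k in range(r, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1.0 - p))
        total += math.exp(log_term)
    return min(total, 1.0)


def tolerance_ranks(n, proportion=0.05, confidence=0.95):
    """Return (outer_rank, inner_rank) as 1-based ranks of sorted scores.

    outer_rank: highest rank such that, with the given confidence, at most
    `proportion` of the population scores below that observation (0 if none).
    inner_rank: lowest rank such that, with the given confidence, at least
    `proportion` of the population scores below that observation.
    """
    outer, inner = 0, None
    for r in range(1, n + 1):
        # Probability that the population share below the r-th ordered
        # observation does not exceed `proportion`.
        tail = binom_upper_tail(r, n, proportion)
        if tail >= confidence:
            outer = r
        if inner is None and 1.0 - tail >= confidence:
            inner = r
    return outer, inner


outer_rank, inner_rank = tolerance_ranks(579)
print(outer_rank, inner_rank)
```

The adjusted scores found at these two ranks would then serve as the outer and inner limits, with the outer rank falling below, and the inner rank above, the rank expected for the 5th percentile (0.05 × N).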

Agreement between the present ES classification and those from previous normative studies [9, 10] was tested by crossing level of abilities via Cohen’s k.
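How such an agreement index is obtained from two classifications of the same individuals can be sketched as follows. The snippet assumes the unweighted form of Cohen's k; the function name and the ES labels are hypothetical and do not come from the present data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa between two classifications of the same cases."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both classifications must be non-empty and aligned.")
    n = len(labels_a)
    # Observed proportion of cases assigned to the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when only one category is used.")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ES classifications (0-4) of eight adjusted scores under
# two different sets of norms.
es_current = [0, 1, 2, 2, 3, 4, 4, 4]
es_previous = [1, 2, 2, 3, 4, 4, 4, 4]
print(round(cohens_kappa(es_current, es_previous), 2))
```

Because ES classes are ordered, a weighted variant that penalizes larger disagreements more heavily could also be considered; the unweighted form above treats every mismatch equally.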

Analyses regarding MoCA total scores were performed on the whole sample, whereas those for single sub-tests and items were conducted on N = 535 participants only due to imputation issues.

Statistical power was computed a posteriori based on the final multiple regression model (numerator df = 3) [35] on MoCA total scores via the R package pwr [36], in line with previous normative studies [37, 38] and by taking into account α = 0.05 and f² derived from fit measures.
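The quantities entering such a power computation can be sketched as follows. The snippet only derives the effect size and the noncentrality parameter under Cohen's convention λ = f²(u + v + 1); it does not evaluate the noncentral F distribution that the final power value requires. The denominator degrees of freedom v = N − u − 1 are an assumption, whereas R² = 0.45 and N = 579 are the values reported in this study.

```python
def cohen_f2(r_squared):
    """Cohen's f-squared effect size from a model's R-squared."""
    return r_squared / (1.0 - r_squared)


def noncentrality(f2, df_numerator, df_denominator):
    """Noncentrality parameter under Cohen's convention: f2 * (u + v + 1)."""
    return f2 * (df_numerator + df_denominator + 1)


n_participants = 579                    # whole sample (MoCA total scores)
u = 3                                   # numerator df of the final model
v = n_participants - u - 1              # assumed denominator df
f2 = cohen_f2(0.45)                     # R-squared reported in the Results
print(round(f2, 2), round(noncentrality(f2, u, v), 1))
```

With an effect size and a sample of this magnitude, the noncentrality parameter is very large, which is consistent with an achieved power close to 1.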

Analyses were performed via SPSS 27 [39] and R 3.6.3 [40]. ES-related procedures were carried out according to guidelines reported by Aiello and Depaoli [41].

Results

Participants’ demographics and MoCA scores (M ± SD and range) are reported in Table 2.

Table 2 Participants’ demographics and cognitive variables

Age proved to be inversely related to both total (Spearman’s rs(579) = −0.57; p < 0.001) and sub-test (−0.46 ≤ rs(535) ≤ −0.11; p ≤ 0.014) MoCA scores, whereas a positive association with education was found for all measures: MoCA total (rs(579) = 0.55; p < 0.001) and sub-test (0.15 ≤ rs(535) ≤ 0.53; p ≤ 0.001) scores. Sex differences were detected with respect to MoCA-A (t(441.8) = 2.42; p = 0.021; males: 5.52 ± 0.81; females: 5.33 ± 0.95), -L (t(482.98) = 2.96; p = 0.003; males: 4.6 ± 0.6; females: 4.42 ± 0.79) and -VS (t(533) = 2.12; p = 0.034; males: 3.22 ± 0.92; females: 3.03 ± 0.98) scores. Moreover, males (24.57 ± 3.47) scored slightly higher (t(494.4) = 1.96; p = 0.05) than females (23.94 ± 4.15) on the MoCA-total. However, when simultaneously tested, only age and education proved to be significantly predictive of all MoCA measures (age: |0.19| ≤ β ≤ |0.38|; p < 0.001; education: |0.16| ≤ β ≤ |0.42|; p < 0.001), with the exception of MoCA-O, which was found to be predicted by age only (β = 0.19; p < 0.001). Achieved power was estimated at 1−β ≈ 1, with an effect size f² = R²/(1−R²) = 0.45/(1−0.45) = 0.82.

Adjustment equations and grids as well as TLs and ESs thresholds are reported in Tables 3 and 4, respectively. Since both MoCA-M TLs corresponded to negative values, the observation corresponding to the first positive adjusted score was regarded as an empirical iTL (yielding a p > 0.99 that 95% of the population performs above it). No adjusted score was thus classified as ES = 0.

Table 3 Adjustment grids according to age and education for MoCA total and sub-test raw scores
Table 4 Equivalent Scores for MoCA total and sub-test adjusted scores

AESs proved to be independent from sex (t(533) = 1.8; p = 0.073), age (r(535) = 0.07; p = 0.119) and education (r(535) = 0.03; p = 0.44).

Weak agreement (0.17 ≤ k ≤ 0.57) [42] was detected between the present and both Conti et al.’s [9] and Santangelo et al.’s [10] ES classifications (see Table 5). More specifically, ESs allotments here reported proved to be stricter than those of Santangelo et al.’s [10] with regard to MoCA-total, -VS, -EF and -A, whereas less strict with regard to and -O and Conti et al.’s [9] total.

Table 5 Comparison between Equivalent Scores classifications

As regards item-level analyses, the MoCA proved to be internally consistent (Cronbach’s α = 0.81).
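How an internal consistency coefficient of this kind is obtained from item-level data can be sketched as follows. This is a minimal illustration of the standard formula for Cronbach's α; the function name and the dichotomous item scores are hypothetical, not data from this study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from rows of item scores (one row per participant)."""
    n_items = len(item_scores[0])
    # Sample variance of each item across participants.
    item_variances = [variance([row[j] for row in item_scores])
                      for j in range(n_items)]
    # Sample variance of the total score across participants.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses of six participants to four dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(cronbach_alpha(responses), 2))
```

The coefficient approaches 1 when items covary strongly, so that the variance of the total score greatly exceeds the sum of the single-item variances.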

A mono-component structure (15.9% of variance explained) was selected from PCA, with the majority of items loading highly (0.3 ≤ r ≤ 0.55), except for N = 8 items (CDT contour, digit span backward, lion and camel naming and all MoCA-O items except for year; 0.02 ≤ r ≤ 0.26).

Item difficulty and discrimination values are displayed in Table 6. The most difficult items proved to be the three-dimension cube copy, CDT hands, repetition of the second sentence, phonemic fluency, the second verbal abstraction item and DR items. The least difficult ones were CDT contour, lion-naming, the letter detection task and month, place and city items of MoCA-O. TMT, repetition of the first sentence, DR items and year and city of MoCA-O proved to be the most effective in discriminating between different levels of ability, whilst those with the lowest values of discrimination were place of MoCA-O and the letter detection task.

Table 6 Item difficulty and discrimination for the MoCA

Discussion

The present work provides Italian practitioners with updated, region-specific normative data for the MoCA, as well as with IRT-based, item-level information that may allow a more flexible and informative use of this screening instrument.

Although norms for the Italian MoCA have been provided in previous studies [9, 10], recent changes in the demographic composition and socio-cultural features of the Italian population motivated the normative branch of this study. Moreover, the present sample is larger and covers wider ranges of age and education (N = 579; age: 21–96; education: 1–25) when compared to previous normative studies (Conti et al. [9]: N = 225; age: 60–80; education: 5–23; Santangelo et al. [10]: N = 415; age: 21–95; education: 1–21). Norms here reported are thus likely to be more representative and generalizable as far as sample size and coverage of anagraphic–demographic variables are concerned.

Moreover, the oTL for MoCA-M had not been provided by Santangelo et al. [10] because it corresponded to a negative adjusted score. Although this finding was replicated in the present study, an empirical iTL for MoCA-M has been provided here, along with ESs thresholds (which, however, did not correspond to negative adjusted scores). Although caution is needed when interpreting this iTL, its practical use is quite intuitive. For instance, a raw score of 1 would be classified as below the aforementioned iTL only for young and highly educated individuals. Accordingly, practitioners would not be allowed to judge that a score below the MoCA-M iTL falls in the worst 5% of the population, although it would be possible to say that 99% of healthy individuals perform above it.

With respect to anagraphic–demographic predictors, MoCA-O scores proved not to be influenced by education in the present study. This finding diverges from previous ones regarding not only the MoCA [10], but also other cognitive screening tests [43, 44]. Similarly, although males were found to perform better than females on MoCA-A and -VS when sex was tested individually, no such differences emerged from models additionally accounting for age and education, contrary to Santangelo et al.’s [10] study. The same held for MoCA-L, although this has not been previously reported [10]. This discrepancy may be attributed to age/education voiding sex differences in this larger sample, and it is in line with inconsistent findings in the relevant literature [45].

Along with the above inconsistencies regarding anagraphic–demographic variables, the fact that the present cut-off thresholds happened to systematically diverge from those of Conti et al. [9] and Santangelo et al. [10] is suggestive of relevant inter-regional differences that should be taken into consideration by Northern Italian practitioners [46]. It is noteworthy that this last aspect has been recently addressed in Italy with respect to the Mini-Mental State Examination [47], for which region-specific norms have been provided for Southern Italian individuals.

Major contributions to an adaptive interpretation [48, 49] of the Italian MoCA also come from single-item-level analyses, which indicate the need to pay particular attention to highly discriminative items when specificity has to be favored, and to highly difficult ones when sensitivity has to be. Of relevance, despite cultural/language differences [24], the present findings are in line with previous ones from eastern countries with regard to the high discriminative capability of MoCA-EF and -M items [18, 19].

This work has a main limitation that needs consideration: a different cognitive screening test was not administered, since assessing concurrent/convergent validity of the MoCA was beyond the present aims. However, due to the lack of such data, it is not possible to rule out sub-clinical cognitive deficits in participants. It is also noteworthy that item- and sub-test-level analyses were performed on a smaller sample (535 out of 579 participants) due to completely-at-random missing values [50].

In conclusion, the present study and its results favor a more informative and flexible use, scoring and interpretation of the Italian MoCA by providing updated and region-specific normative data at the sub-test level, also comprising a proxy cut-off for MoCA-M scores; moreover, novel information on sensitivity and discriminative capability of single Italian MoCA items has been provided.