In our controlled study of 110 participants with a median clinical follow-up of 11 months, the performance of five machine learning procedures, each evaluated with 100-time nested cross-validation, was compared for classifying IPD versus controls. Our results demonstrated that, among several well-established standard PET metrics, visual assessment and the semi-quantitative min SUV ratio provided similar performance. In DAT SPECT, despite improvements with CZT systems, spatial resolution and detection sensitivity remain lower than with PET-CT systems; together with the lack of anatomical landmarks for accurate realignment of the brain, this has motivated the use of semi-quantitative ratios to limit inter-reader variability. Although semi-quantification showed increased diagnostic performance in a few SPECT studies [31‐33], such an approach does not preclude interobserver variation if performed manually or semi-automatically, and is still considered an adjunct to visual analysis. Although aromatic l-amino acid decarboxylase (AADC) striatal deficiency can be quantified by dynamic PET acquisitions in Parkinsonian syndromes [34], practical considerations and similar diagnostic performance in cross-sectional studies have led to the promotion of simpler SUV ratios. DAT SPECT and
18F-DOPA PET visual interpretation share similar features but have two different targets in dopaminergic neurotransmission, which can decrease in parallel but not necessarily synchronously with the progression of neurodegenerative Parkinsonism. In theory, in early IPD, compensatory mechanisms trigger downregulation of presynaptic DAT expression and upregulation of l-amino acid decarboxylase, which was confirmed in a meta-analysis [22, 23, 25, 35] demonstrating that the AADC defect seen with 18F-DOPA PET is consistently smaller than the DAT defect seen in SPECT studies. Nevertheless, both are able to diagnose presynaptic dopaminergic deficits in the early phases of PD with excellent sensitivity and specificity [11], and to our knowledge no study has shown the superiority of either procedure. Compared to SPECT,
18F-DOPA PET provides higher detection sensitivity and spatial resolution, thus allowing a much finer visual assessment of deep brain structure involvement. In this context, and except for specific follow-up study purposes [36‐39], the justification for semi-quantification or more sophisticated methods over visual assessment to diagnose Parkinsonism remains largely under-evaluated in clinical practice. In the era of precision medicine, the search for new powerful and robust imaging biomarkers is a topic of intense interest in a wide variety of diseases. Automated image processing workflows have emerged and could facilitate physicians' interpretation in daily practice. Motivated by its rationale in high-spatial-resolution morphological imaging, radiomics is gradually spreading into nuclear imaging of oncological and non-oncological diseases, including Parkinsonian syndromes [40‐44]. In their very recent paper, Comte et al. trained and validated a logistic regression model with L1 regularization to identify dopaminergic denervation on
18F-DOPA PET/CT [44]. Among 43 first- and higher-order parameters, three textural features were found to identify abnormal 18F-DOPA PET almost as well as a nuclear imaging expert, considered here as the gold standard and study outcome. As mentioned by the authors, the clinical utility of such an approach remains unknown, which also calls into question its conceptual diagnostic relevance in this particular setting, given the limited semiology of
18F-DOPA PET pattern abnormalities and the well-known major logistical drawbacks of handcrafted radiomic pipelines in real-life practice. Recently, the clinical utility of deep-learning-based methods to identify Parkinson disease directly from PET data has been emphasized, with very promising results [45‐47]. Deep learning conceptually addresses all the limitations of handcrafted radiomics procedures and would probably constitute a more powerful and efficient alternative to human expert reading for basic imaging identification tasks. In this respect, capturing the objective min SUVr from the striata, which was here as relevant as visual expert assessment, appears to be a promising way toward simple assisted analysis workflows. However, multicenter studies are mandatory to overcome the lack of reproducibility of standard PET semi-quantitative metrics related to the image reconstruction properties of PET systems (the well-known center effect). In accordance with the recent EANM guidelines [11], visual analysis remains to date the most relevant parameter to predict IPD.
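As a purely illustrative sketch of what capturing a min SUVr could look like in an assisted workflow, the Python snippet below divides the mean uptake of each striatal subregion by that of a reference region and keeps the lowest ratio. The region names, the occipital reference, the example values, and the function name `min_suvr` are assumptions made for this example; they do not describe the pipeline used in this study.

```python
from typing import Mapping


def min_suvr(striatal_suv: Mapping[str, float], reference_suv: float) -> tuple[str, float]:
    """Return the striatal subregion with the lowest SUV ratio, and that ratio.

    striatal_suv: mean SUV per striatal subregion (hypothetical region names).
    reference_suv: mean SUV of a non-specific reference region (assumed here
    to be the occipital cortex; this is an assumption of the sketch).
    """
    if reference_suv <= 0:
        raise ValueError("reference SUV must be positive")
    if not striatal_suv:
        raise ValueError("at least one striatal subregion is required")
    # Ratio of each subregion to the reference region.
    ratios = {name: suv / reference_suv for name, suv in striatal_suv.items()}
    # Keep the subregion with the lowest ratio.
    lowest = min(ratios, key=ratios.get)
    return lowest, ratios[lowest]


# Illustrative values only, not patient data.
example = {
    "left_caudate": 2.6,
    "right_caudate": 2.5,
    "left_putamen": 1.8,
    "right_putamen": 2.4,
}
region, ratio = min_suvr(example, reference_suv=1.0)
print(f"min SUVr = {ratio:.2f} in {region}")
```

Because the ratio depends on the reference region and on reconstruction properties, any cut-off applied to such a value would have to be validated per center, in line with the center effect noted above.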
Our study has several limitations. First, the median follow-up of 11 months could have led to diagnostic misclassification [48]. Because the new Movement Disorder Society clinical diagnostic criteria [6] are currently judged not useful by experts in real-life practice [5], trained neurologists typically base the diagnosis on medical history, clinical symptoms (bradykinesia, rigidity, tremor) and symptom evolution under treatment. In atypical cases, a panel of experts reaches its conclusions based on a full multidisciplinary work-up. Second, our study mainly included outpatients with mild, early symptoms, for whom the clinical diagnosis was ambiguous, justifying 18F-DOPA imaging (in the case of cardinal symptoms, in particular at advanced stages, dopaminergic imaging has no clinical relevance). Because our patients did not yet have clinical confirmation at the time of
18F-DOPA PET imaging, the IPD severity score was not available at that time, which reflects the real-life practice conditions of our study. Of note, this study was not designed for the PET assessment of clinical disease severity, which is beyond its scope and has been widely studied in the past. Third, from the hundreds of quantitative morphological and metabolic measures available with the FreeSurfer and PETSurfer neuroimaging pipeline, only a few semi-quantitative metabolic PET metrics were considered clinically relevant given the pathophysiology of dopaminergic denervation of the striatum: SUV metabolic ratios, gradient, and asymmetry indices. This choice was motivated by two considerations: (1) we wanted to compare well-known and usable PET metrics that are easily applicable in routine practice and, more importantly, transposable to the individual level, in contrast with recently published radiomics studies; (2) making statistical inferences with fewer than 10 participants per parameter becomes conceptually unacceptable [49, 50]. Fourth, the selected semi-quantitative features were compared against the median performance of five experts who blindly and independently reviewed each case, in order to limit potential inter-reader heterogeneity. One strength of a future machine learning model in this context could be the reproducibility of its predictions compared to a single expert reading. The main asset of our study is the use of simultaneous PET/MRI acquisition, which improves striatal segmentation by using the T1-weighted sequence rather than PET images [51]. Nevertheless, the potential of MRI is not restricted to morphological analysis. Promising results have shown a high correlation between Parkinson disease and specific multiparametric MRI investigation of the brainstem [52], in particular iron deposition in the substantia nigra. Recent results have also shown a correlation between iron deposition in the substantia nigra and striatal dopamine denervation seen with
18F-DOPA PET [53, 54]. All these results may promote further research to better capture the relevance of combining 18F-DOPA and MRI capabilities in this field.
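Two of the quantities mentioned in the limitations above lend themselves to a short worked example: an asymmetry index and the rule of thumb of at least 10 participants per parameter. The Python sketch below is illustrative only. The asymmetry formula (absolute left/right difference divided by the left/right mean, in percent) is one common convention and is an assumption here, as are the function names and the example values.

```python
def asymmetry_index(left: float, right: float) -> float:
    """Percent left/right asymmetry: |L - R| / mean(L, R) * 100.

    One common convention, assumed for this sketch; the formula actually
    used in the study is not restated in this section.
    """
    mean = (left + right) / 2.0
    if mean <= 0:
        raise ValueError("mean uptake must be positive")
    return abs(left - right) / mean * 100.0


def max_parameters(n_participants: int, per_parameter: int = 10) -> int:
    """Largest number of parameters compatible with a rule of thumb of
    at least `per_parameter` participants per parameter."""
    if per_parameter <= 0:
        raise ValueError("per_parameter must be positive")
    return n_participants // per_parameter


# Illustrative putamen SUV ratios, not patient data.
ai = asymmetry_index(left=1.5, right=2.5)

# Taking the rule at face value with the full cohort of 110 participants.
budget = max_parameters(110)

print(f"asymmetry index = {ai:.1f}%, parameter budget = {budget}")
```

Read this way, a cohort of 110 participants supports roughly 11 parameters at most, which is consistent with restricting the analysis to a handful of semi-quantitative metrics rather than hundreds of candidate features.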