Our study represents the first characterization of the gut mycobiome composition of a large cohort of patients with early-stage LUAD. Additionally, we presented an innovative and non-invasive approach involving gut mycobiome-based ML classification for the convenient diagnostic screening of LUAD. A diagnostic model based on microbial OTU markers was successfully established and validated across three different regions in China.
To date, studies on the gut mycobiome have been limited to varying extents by a deficiency of appropriate detection methods (i.e., fungi are less amenable to culturing than bacteria), technical limitations, and a lack of comprehensive reference databases. However, rapid advances in bioinformatics methodology in recent years have accelerated the identification of fungi, expanding our knowledge of the fungal kingdom and of the contribution of fungi to human health and disease. With respect to high-throughput sequencing, the selection of appropriate barcoding primers and amplification conditions is considered a key prerequisite. In this context, amplification and sequencing of the ITS1 (between 18S and 5.8S) and ITS2 (between 5.8S and 28S) regions is a widely adopted approach in studies of the human gut mycobiome [21], although a consensus has yet to be reached regarding the selection of ITS sub-regions. A survey of the relevant literature suggests that, compared with ITS1, ITS2 is associated with less amplification and sequencing bias [21, 34]. Consistent with this assessment, in a preliminary phase of this study, we had relatively limited success when using primers targeting ITS1 [31]. Consequently, we selected primers targeting the ITS2 sub-region in the present study.
Fungi are complex organisms known to play an opportunistic role during immunosuppressive and antibiotic therapies [18]. Fungal invasion induces the synthesis of various signaling molecules, including transforming growth factor-β, interleukin (IL)-6, IL-12, IL-23, IL-1β, and interferon-γ, which trigger Th1 and Th17 cell responses, in parallel with macrophage activation and neutrophil recruitment [18, 35]. Inflammation induced by pathogens is a major mechanism promoting carcinogenesis [36]. The promotion of carcinogenesis by fungal metabolites has been suggested as another major mechanism. The carcinogenic effects of acetaldehyde [37] produced by Candida and aflatoxin [38] produced by Aspergillus have been demonstrated. In our study, the intestinal fungal profiles of LUAD cases differed from those of HCs. The gut fungal diversity and richness markedly increased during the progression of LUAD, suggesting that mycobiome alterations potentially promote the pathological progression of LUAD. The predominant phyla in both patients with LUAD and HCs were Ascomycota and Basidiomycota, consistent with previously reported fungal profiles in other malignant tumor types [39]. The abnormal changes in the abundance of Ascomycota and Basidiomycota in the LUAD group may reflect fungal dysbiosis, in line with previous studies on colorectal cancer and pancreatic cancer [22, 23]. At the genus level, Candida and Saccharomyces were the most abundant in our cohort. Previous studies have shown that Candida, Saccharomyces, Malassezia, and Cladosporium spp. are the most prevalent fungi in the healthy human gut [40]. However, slight variations in the dominant genera are found in different study cohorts, possibly due to sample size bias or the different geographical locations of participants. Hoffman et al. [41] have reported that Saccharomyces, Candida, and Cladosporium are the most abundant genera in healthy subjects. In a study by Nash et al. [42], Saccharomyces, Malassezia, and Candida were the most abundant genera in healthy subjects.
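The notions of fungal richness and diversity referred to above can be made concrete with a minimal sketch. The Python snippet below computes observed richness (the number of OTUs with nonzero counts) and the Shannon index for the OTU count vector of a single sample. The choice of the Shannon index, the function names, and the toy counts are illustrative assumptions; this section does not state which indices were used in the study.

```python
import math

def observed_richness(otu_counts):
    """Number of OTUs detected (count > 0) in one sample."""
    return sum(1 for c in otu_counts if c > 0)

def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in otu_counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical read counts per fungal OTU for two samples (not study data).
even_sample = [25, 25, 25, 25]    # four OTUs, evenly represented
skewed_sample = [97, 1, 1, 1]     # four OTUs, one dominant

for name, counts in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, observed_richness(counts), round(shannon_index(counts), 3))
```

Both toy samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why richness and diversity are usually reported together.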
Candida is a prominent opportunistic fungal pathogen in humans and is involved in many other diseases, including inflammatory bowel disease (IBD) [43, 44], alcohol-associated liver disease [45, 46], asthma [47], and COVID-19 [48]. A recent study on pan-cancer mycobiomes in tumor tissues has revealed that Candida is associated with pro-inflammatory gene expression, tumor metastasis, and poorer survival outcomes, especially for gastrointestinal cancers, indicating that the detection of Candida may represent a novel predictive biomarker and therapeutic target [25]. Although Candida was the most predominant genus in this study, it was not associated with the disease phenotype. In contrast, the proportion of Saccharomyces was significantly higher in patients with LUAD than in controls.
Saccharomyces spp., known as baker's and brewer's yeasts, are commonly used in food fermentation. The role of Saccharomyces in disease is controversial. Saumya et al. [49] have identified Saccharomyces as the most abundant (42%) genus in patients with multiple sclerosis (MS), a chronic autoimmune disease of uncertain etiology. Beyond its increased abundance in patients with MS compared with controls, Saccharomyces was also associated with the peripheral immune response, implying a pathogenic correlation between Saccharomyces and MS. In contrast, Harry et al. [44] have reported that Saccharomyces, and especially Saccharomyces cerevisiae, show a markedly decreased abundance in patients with IBD, and that S. cerevisiae exhibits anti-inflammatory effects involving increased secretion of IL-10. These results highlight the complexity of fungi–host interactions and the urgent need for further exploration of their effects on health and disease.
As the gut mycobiome is a highly variable and dynamic community, disease-associated fungal taxa identified from limited sample sizes may not be reliable biomarkers in diagnostic applications. Therefore, in addition to analyzing changes in gut fungal composition in patients with early-stage LUAD, our study applied OTU-based gut mycobiome features to train a supervised ML model. ML refers to a wide range of algorithms that can make predictions that mimic human decisions and represents a major form of artificial intelligence [50]. Such technologies have been widely used in healthcare with remarkable results, such as the use of artificial intelligence image recognition to accurately diagnose multiple malignant tumors from medical images [51–53] and the use of ML to predict the prognosis and survival of patients with malignant tumors [27, 54]. However, some uncertainty exists about their diagnostic efficacy [55]. In our study, an exploratory analysis of five commonly available supervised ML algorithms was carried out to compare their performance in predicting LUAD. The results showed that RF achieved an excellent predictive AUC of 0.9350 for distinguishing patients with early-stage LUAD from healthy subjects. Moreover, considering that gut microbiota may be influenced by diet and geography, we conducted cross-regional validation to better verify the efficacy and applicability of the models. Similar to gut bacteria, the gut mycobiome undergoes changes during the human lifetime, and geography, dietary habits, and host factors, including sex, age, and drug use, are prominent factors that shape the gut mycobiome composition [56]. Yang et al. characterized gut mycobiome profiles across different regions in China, including six ethnicities at a large population scale, and found that geography and ethnicity have pronounced effects on the variations in gut fungi [57]. In the present study, despite the confounding factors of geography and diet, all the validation cohorts showed excellent results, indicating the potential significance of fungal markers in the diagnosis of LUAD and the broad applicability of our approach in different geographical regions.
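To illustrate how an AUC value and a cross-regional validation of this kind can be evaluated, the following is a minimal Python sketch rather than the pipeline used in this study. It computes the AUC as the probability that a randomly chosen LUAD sample receives a higher classifier score than a randomly chosen healthy control, and reports it separately for each cohort. The function names, the cohort labels, and the scores are hypothetical; the scores are assumed to come from an already trained classifier such as RF.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: 1 for LUAD, 0 for healthy control; scores: classifier outputs.
    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_by_cohort(records):
    """records: iterable of (cohort, label, score); returns {cohort: AUC}."""
    grouped = {}
    for cohort, label, score in records:
        labels, scores = grouped.setdefault(cohort, ([], []))
        labels.append(label)
        scores.append(score)
    return {c: roc_auc(l, s) for c, (l, s) in grouped.items()}

# Hypothetical predictions (not study data): cohort, true label, predicted score.
records = [
    ("cohort_A", 1, 0.9), ("cohort_A", 1, 0.4),
    ("cohort_A", 0, 0.5), ("cohort_A", 0, 0.1),
    ("cohort_B", 1, 0.8), ("cohort_B", 0, 0.3), ("cohort_B", 0, 0.2),
]
aucs = auc_by_cohort(records)
print(aucs)
```

Reporting the AUC per cohort, rather than pooling all samples, makes it visible whether performance holds in each region or is driven by a single large cohort.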
The limitations of the current study include the low number of fecal samples from the Suzhou and Hainan cohorts. Self-reported drug intake may introduce a certain degree of bias. A larger sample size and stricter screening criteria in multiple centers are needed to further validate the results. In addition, further animal studies are required to verify the potential association between altered fungal diversity and tumor formation.