Published in: Journal of Cancer Research and Clinical Oncology, Issue 7/2023

26.08.2022 | Research

Classification prediction of early pulmonary nodes based on weighted gene correlation network analysis and machine learning

Authors: Guang Li, Meng Yang, Longke Ran, Fu Jin

Abstract

Objective

To predict the classification of early pulmonary nodules from public databases using weighted gene correlation network analysis (WGCNA) and machine learning algorithms.

Methods

Expression and clinical data of lung cancer patients were first extracted from public databases (GTEx and TCGA) to identify the differentially expressed genes (DEGs) of lung adenocarcinoma (LUAD). Genes identified by all three R packages (DESeq2, limma, edgeR) were selected as candidate DEGs for further study. WGCNA was used to obtain the modules and key genes relevant to lung cancer classification, and GO and KEGG enrichment analyses were performed. Two machine learning methods were then applied: a model was built with Least Absolute Shrinkage and Selection Operator (LASSO) regression, and tumor classification was also predicted with the eXtreme Gradient Boosting (XGBoost) algorithm.
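The consensus step described above, keeping only genes that all three differential expression tools agree on, can be sketched as a set intersection. This is a minimal illustration, not the authors' pipeline: the gene names, fold changes, and adjusted p-values are invented, and the cut-offs (|log2FC| >= 1, adjusted p < 0.05) are assumptions, since the abstract does not state the thresholds used.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-tool output: gene -> (log2 fold change, adjusted p-value).
Result = Dict[str, Tuple[float, float]]


def significant(results: Result, lfc_cut: float = 1.0, padj_cut: float = 0.05) -> Set[str]:
    """Return genes passing both the fold-change and adjusted p-value cut-offs."""
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if abs(lfc) >= lfc_cut and padj < padj_cut
    }


def consensus_degs(*per_tool: Result) -> Set[str]:
    """Keep only genes called significant by every tool."""
    sets = [significant(r) for r in per_tool]
    return set.intersection(*sets) if sets else set()


# Made-up values standing in for DESeq2, limma, and edgeR result tables.
deseq2 = {"GENE_A": (2.3, 0.001), "GENE_B": (-1.8, 0.01), "GENE_C": (0.4, 0.2)}
limma = {"GENE_A": (2.1, 0.002), "GENE_B": (-1.5, 0.03), "GENE_C": (1.2, 0.04)}
edger = {"GENE_A": (2.5, 0.0005), "GENE_B": (-0.7, 0.04), "GENE_C": (1.4, 0.01)}

candidates = sorted(consensus_degs(deseq2, limma, edger))
print(candidates)
```

In this toy input only GENE_A passes all three tools; GENE_B fails the fold-change cut-off in the edgeR stand-in and GENE_C fails in the DESeq2 stand-in. Requiring agreement across tools trades sensitivity for fewer tool-specific false positives.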

Results

DEG analysis identified 1306 differentially expressed genes in LUAD. WGCNA module analysis showed that 116 genes were significantly related to classification, and the module genes were mainly associated with 14 KEGG pathways. LASSO regression analysis of the differential genes identified 10 target genes, and the XGBoost model identified 18 genes. The intersection of the above methods yielded 6 genes as classification signatures of early pulmonary nodules: HMGB3, ARHGAP6, TCF21, FCN3, COL6A6, and GOLM1.
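LASSO narrows a gene list because its L1 penalty shrinks weak coefficients to exactly zero, so the genes with nonzero coefficients form the selected set. The sketch below shows that mechanism with a small coordinate-descent solver. It rests on several assumptions: the abstract does not say which LASSO implementation, loss, or penalty value was used, so this uses a plain squared-error objective, a hypothetical penalty of 0.5, invented gene names, and toy data with standardized features and a centered response.

```python
from typing import List


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero by penalty; return exactly 0.0 inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(
    X: List[List[float]], y: List[float], penalty: float, n_iter: int = 100
) -> List[float]:
    """Minimize (1/2n)*||y - Xb||^2 + penalty*||b||_1 by cyclic coordinate descent.

    Assumes no intercept (features standardized, response centered).
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * coef[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            # An all-zero column carries no information; leave its coefficient at zero.
            coef[j] = soft_threshold(rho / n, penalty) / (z / n) if z > 0 else 0.0
    return coef


# Toy data: rows are samples, columns are three hypothetical genes.
genes = ["GENE_A", "GENE_B", "GENE_C"]
X = [[1.0, 1.0, 1.0], [1.0, -1.0, -1.0], [-1.0, 1.0, -1.0], [-1.0, -1.0, 1.0]]
y = [2.1, 1.9, -1.9, -2.1]  # driven mainly by GENE_A, weakly by GENE_B

coef = lasso_coordinate_descent(X, y, penalty=0.5)
selected = [g for g, c in zip(genes, coef) if c != 0.0]
print(selected)
```

Here only GENE_A keeps a nonzero coefficient; the weak GENE_B effect falls inside the penalty band and is dropped. In practice the penalty is usually chosen by cross-validation rather than fixed by hand.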

Conclusion

Using DEG analysis, WGCNA, and machine learning algorithms, six gene signatures related to early-stage LUAD were identified, which can assist clinicians in predicting disease classification.
Metadata
Title
Classification prediction of early pulmonary nodes based on weighted gene correlation network analysis and machine learning
Authors
Guang Li
Meng Yang
Longke Ran
Fu Jin
Publication date
26.08.2022
Publisher
Springer Berlin Heidelberg
Published in
Journal of Cancer Research and Clinical Oncology / Issue 7/2023
Print ISSN: 0171-5216
Electronic ISSN: 1432-1335
DOI
https://doi.org/10.1007/s00432-022-04312-7