Introduction
Breast cancer has become the most common cancer in women, surpassing lung cancer as the leading cause of cancer incidence. The high incidence rate of the disease, with more than 2.3 million new cases each year, continues to be a cause for concern [1]. Due to the variation in molecular traits, histological features, and clinical outcomes, breast cancer is classified into several subtypes, providing valuable insights into the disease and aiding treatment planning. Breast cancer is typically divided into six subgroups based on their molecular characteristics: basal-like, claudin-low, normal-like, luminal A, luminal B, and HER2-positive. Each subgroup has a distinct molecular profile. The basal-like and claudin-low subtypes of triple-negative breast cancer (TNBC) lack expression of estrogen receptor (ER), progesterone receptor (PR), and HER2. These subtypes are associated with a higher risk of disease relapse and a greater likelihood of developing visceral metastases [2].
Biomarkers play a crucial role in identifying breast cancer, predicting outcomes, and selecting therapeutic approaches. However, some commonly used biomarkers, including carcinoembryonic antigen (CEA), CA 15-3, and CA 27-29, have insufficient sensitivity and specificity, making them unsuitable for detecting breast cancer. Instead, they are recommended for monitoring disease progression and evaluating treatment response, particularly in patients with metastatic breast cancer [3]. On the other hand, biomarkers such as ER, PR, and HER2 have been extensively used in the management of breast cancer. They provide valuable prognostic information and serve as targets for targeted therapy and hormone therapy [4]. To advance breast cancer diagnosis and treatment, researchers need a comprehensive understanding of the molecular pathways that underlie breast carcinogenesis. Despite years of dedicated research, the overall 5-year survival rate of breast cancer patients remains unsatisfactory [5]. Consequently, there is a significant need for reliable, novel biomarkers to aid the early detection of breast cancer, enhance prognostic accuracy, enable precise prediction of disease behavior, and facilitate the development of targeted therapeutic approaches.
High-throughput gene expression technologies provide comprehensive genetic information on cancer samples and identify changes during disease progression [6-8]. High-throughput data such as genomics, epigenomics, and transcriptomics data in online databases have been mined to identify potentially novel cancer-associated biomarkers. Recently, machine learning models such as support vector machines (SVM) and random forests have become attractive strategies for obtaining gene signatures.
This study identified new genes associated with breast cancer using large-scale transcriptomics data and the random forest technique. The expression of these genes in breast cancer tissues was validated by qRT-PCR and compared with that in normal tissues.
Discussion
There is an urgent need to characterize new biomarkers that can facilitate early detection of breast cancer and overcome the limitations of mammography and the challenges of current tumor biomarkers such as CA-125 and CEA [22, 23].
In this study, RNA expression data were obtained from TCGA to identify differentially expressed genes (DEGs) between breast invasive carcinoma (BRCA) and normal samples. The up-regulated genes were then analyzed using the random forest algorithm to identify the most important genes. These key genes were further screened for overexpression in breast cancer tissues, low median expression in normal female tissues, and potential as novel diagnostic biomarkers. This screening identified four genes: CACNG4, PKMYT1, EPYC, and CHRNA6. Integrated online bioinformatics databases were used to gain insight into the diagnostic, prognostic, and therapeutic roles of these potential biomarkers. Analysis using the UCSC Xena tool confirmed higher expression of these four genes in breast cancer tissues than in normal tissues. In vitro quantification by qRT-PCR in breast tumor tissues further confirmed the overexpression of these newly identified BRCA biomarkers. The association of these genes with various clinico-pathological parameters in breast cancer patients suggests that they could serve as potential therapeutic biomarkers. Biological pathway analysis revealed the involvement of these genes in the regulation of key cellular processes, including cell growth, which is critical for cancer development and progression. In addition, analysis of the COSMIC and cBioPortal databases showed that aberrant expression of these genes in breast cancer is associated with mutations and genetic alterations. These findings provide valuable insights for researchers investigating the molecular mechanisms of breast cancer. They also provide clinicians with potential targets that could improve diagnostic accuracy and contribute to the development of more effective treatment strategies.
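As an illustration of the screening workflow described above, the following minimal Python sketch filters up-regulated genes by log2 fold change and then ranks them by random forest importance. It assumes NumPy and scikit-learn are available. The synthetic expression matrix, the placeholder gene names (GENE_A to GENE_F), the fold-change cutoff of 1.0, and the forest settings are illustrative assumptions and are not the data or parameters used in this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic log2 expression values: 60 normal and 60 tumor samples, 6 genes.
# Gene names are placeholders, not the genes reported in this study.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"]
n_per_class = 60
normal = rng.normal(5.0, 1.0, size=(n_per_class, len(genes)))
tumor = rng.normal(5.0, 1.0, size=(n_per_class, len(genes)))
tumor[:, 0] += 4.0  # GENE_A: strongly up-regulated in tumors
tumor[:, 1] += 3.0  # GENE_B: up-regulated
tumor[:, 2] += 2.0  # GENE_C: moderately up-regulated

X = np.vstack([normal, tumor])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = normal, 1 = tumor

# Step 1: keep up-regulated genes. On log2-scale data, the difference of
# group means is the log2 fold change. The 1.0 cutoff is an assumption.
log2_fc = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
up_idx = np.flatnonzero(log2_fc > 1.0)
candidates = [genes[i] for i in up_idx]

# Step 2: fit a random forest (tumor vs. normal) on the candidate genes
# and rank them by impurity-based feature importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X[:, up_idx], y)

ranked = sorted(
    zip(candidates, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for gene, importance in ranked:
    print(f"{gene}\t{importance:.3f}")
```

In a real analysis the fold-change filter would normally be paired with a multiple-testing-corrected significance test, and the importance ranking would be checked for stability across random seeds or by cross-validation.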
The identification of CACNG4 as a potential breast cancer biomarker could be an important step towards improving clinical outcomes for breast cancer patients. As a type I transmembrane AMPA receptor regulatory protein, CACNG4 plays a critical role in regulating both channel gating and trafficking of AMPA receptors [24]. Amplification of CACNG4 has been shown to contribute to increased breast cancer cell motility, transformation, and metastasis, highlighting the importance of targeted therapies that can disrupt its actions [25]. As CACNG4 is located on the plasma membrane, antibody-based therapies have the potential to inhibit its function and impede breast cancer progression, providing a viable approach for the development of novel treatment strategies. Additionally, our biological pathway analysis revealed that CACNG4 is involved in the ErbB receptor signaling network and the mTOR signaling pathway, both of which have been implicated in cancer metastasis and poor prognosis by Drago et al. and Tian et al. [26, 27]. It is worth noting that the molecular function of CACNG4 is voltage-gated calcium channel activity. Studies have shown that calcium channel antagonists have anti-proliferative effects on various cell types, including vascular, retinal pigment, and prostate cancer cells. Therefore, targeting Ca2+ pumps or channels has been suggested as a potential therapeutic approach for the treatment of breast cancer [28].
PKMYT1 (protein kinase, membrane-associated tyrosine/threonine 1), a member of the WEE kinase family, is a negative regulator of the G2/M transition of the cell cycle and has been implicated in the development and progression of several cancers, including hepatic cancer, glioblastoma, colorectal cancer, and non-small cell lung cancer [29]. Overexpression of PKMYT1 in these cancers is typically associated with poor prognosis and disease progression [30]. Based on Kaplan-Meier plotter database analysis, PKMYT1 overexpression is also associated with a poor prognosis in breast cancer patients. Liu et al. likewise reported that PKMYT1 overexpression was linked to poor prognosis, suggesting that it may be an appealing therapeutic target for breast carcinoma [29]. A study by Zhang et al. demonstrated that PKMYT1 upregulation promotes tumor progression and correlates with poorer overall survival in patients with esophageal squamous cell carcinoma (ESCC) [31]. In this study, FunRich analysis revealed that the genes co-expressed with PKMYT1 are mainly involved in the cell cycle and DNA replication, indicating that overexpression of this gene could promote breast tumorigenesis. Upregulation of this protein is crucial for the development of some cancers, such as glioblastoma, colon cancer, and hepatic carcinoma [32], and promotes gastric cancer (GC) cell proliferation and apoptosis resistance [33]. This may be because PKMYT1 enhances AKT/mTOR signaling, which promotes carcinogenesis, and drives the progression of cancer cells through other pathways, such as activation of Notch signaling [34]. In the Cancer Dependency Map analysis tool, lower dependency scores correspond to a higher likelihood that a gene is essential for cell survival or growth. PKMYT1 was identified as critical for breast cancer cell line survival, suggesting that targeting it may be a viable strategy for therapeutic intervention in breast cancer patients. In another study, PKMYT1 was identified as a promising target to enhance the radiosensitivity of lung adenocarcinoma (LUAD). This finding suggests that PKMYT1 could be an attractive target for anticancer therapy [35].
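The survival comparison that underlies a Kaplan-Meier analysis can be illustrated with the product-limit estimator. The sketch below is plain Python written for illustration only. It is not the code used by the Kaplan-Meier plotter database, and the follow-up times, event flags, and the two expression groups are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time of each patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) for high- and low-expression groups.
high_times, high_events = [5, 8, 12, 20, 30], [1, 1, 1, 0, 1]
low_times, low_events = [10, 25, 40, 50, 60], [1, 0, 1, 0, 0]

high_curve = kaplan_meier(high_times, high_events)
low_curve = kaplan_meier(low_times, low_events)

for label, curve in (("high", high_curve), ("low", low_curve)):
    steps = ", ".join(f"S({t})={s:.2f}" for t, s in curve)
    print(f"{label} expression: {steps}")
```

A curve that drops faster in the high-expression group is consistent with poorer prognosis, but a formal comparison would also require a log-rank test or a Cox model, which this sketch does not include.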
Epiphycan (EPYC), also known as dermatan sulfate proteoglycan 3, is a member of the small leucine-rich repeat proteoglycan family. It interacts with collagen fibrils and other extracellular matrix proteins and regulates fibrillogenesis. It has been suggested that EPYC is involved in bone formation, maintaining joint integrity, and establishing the organized structure of cartilage through matrix organization [36]. According to the GeneCards database, the EPYC protein is secreted into the extracellular matrix. Studies have shown that insufficient expression of EPYC can lead to corneal dystrophy and hearing loss [37]. However, there have been very few studies on the role of EPYC in cancer. FunRich analysis revealed that genes co-expressed with EPYC are mainly involved in epithelial-mesenchymal transition (EMT), a process in which breast cancer cells acquire mobility, leading to progression and metastasis [38]. A study by Deng et al. investigated the effects of EPYC overexpression on the proliferation, invasion, and metastasis of ovarian cancer cells [36]. In the current study, EPYC was found to be positively co-expressed with COL11A1 and MMP13. Overexpression of COL11A1 is often associated with an aggressive tumor phenotype and a poor prognosis in many solid tumor types, including pancreatic, breast, ovarian, and colorectal cancers [39]. MMP-13 may be vital for the invasion and metastasis of breast cancer cells [40] and may be helpful as a prognostic marker when assessed together with lymph node status and HER2 expression [41]. Additionally, Spearman correlation analysis revealed a significant positive correlation between the mRNA expression levels of CACNG4 and EPYC, indicating that the expression of these two genes is associated.
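For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation of rank-transformed values. The following plain-Python sketch shows the calculation on two hypothetical expression vectors. The values, the variable names gene_1 and gene_2, and the helper functions are illustrative assumptions and are not data or code from this study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical mRNA expression values for two genes across six samples.
gene_1 = [2.1, 3.4, 1.0, 5.6, 4.2, 6.3]
gene_2 = [1.5, 2.9, 0.7, 4.0, 4.8, 5.1]

rho = spearman_rho(gene_1, gene_2)
print(f"Spearman rho = {rho:.3f}")
```

The magnitude of rho describes the strength of the monotonic relationship, whereas statistical significance depends additionally on sample size, so the two should be reported together.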
The CHRNA6 gene encodes an alpha subunit of neuronal nicotinic acetylcholine receptors, which function as ion channels and play a crucial role in neurotransmission in the nervous system. This protein is activated by acetylcholine and exogenous nicotine and mediates dopaminergic neurotransmission. According to the GeneCards database, the CHRNA6 protein is predicted to be expressed on the plasma membrane, indicating that antibody-targeted therapy could be helpful. However, to our knowledge, no in silico or experimental study has examined the effects of CHRNA6 on cell proliferation and tumor progression, so it could be a potential novel biomarker in cancer studies. This study investigates for the first time the mRNA expression of CHRNA6 in breast tumor tissues. FunRich analysis revealed that CHRNA6 interacts with other molecules and is involved in the ErbB receptor signaling network and signal transduction. The ErbB pathway plays a crucial role in regulating cell growth and differentiation, and its dysregulation has been implicated in various cancers [27]. Analysis of clinico-pathological databases showed that HER2 upregulation and PR downregulation were associated with high CHRNA6 expression, and BRCA1/2 mutation was associated with low CHRNA6 expression, suggesting that CHRNA6 may be a potential diagnostic biomarker in breast cancer. The co-expression of CHRNA6 with TLR7 and OLR1 was investigated and confirmed. Survival analysis showed that TLR7 expression had a significant impact on survival [42]. OLR1 overexpression was associated with a poor prognosis in breast cancer and might represent a potential therapeutic target for breast cancer patients [43].
This retrospective study has some limitations. First, although new breast cancer-associated biomarkers are predicted, their mechanism of action remains unclear. Second, the results need to be validated in a larger sample and through additional experimental studies. Prospective, large-scale clinical studies are therefore needed to confirm these findings.