Introduction
Prostate cancer (PCa), a common malignant tumor, is the second leading cause of cancer-related mortality in men worldwide [
1]. It can metastasize to bone (80%-100%), lymph nodes, liver, adrenal gland, or lung [
2]. Although most early localized prostate cancer can be treated successfully by prostatectomy or radiotherapy, with a 5-year survival rate of 98.9%, metastasis is often already present at initial diagnosis, which worsens the prognosis. Therapy for metastatic PCa remains limited, and the current standard therapy is androgen deprivation therapy (ADT) combined with chemotherapy [
3,
4]. Although ADT is initially effective, most patients inevitably progress to lethal metastatic castration-resistant prostate cancer (mCRPC) within 2–3 years [
5], and their 5-year survival rate is only 28.2% [
1]. Accordingly, there is significant enthusiasm to improve the stratification of patients with prostate cancer so that high-risk patients can be found earlier and receive active treatment.
The homologous recombination pathway plays a vital role in DNA repair and involves many genes [
6], including
BRCA (
BRCA1/2),
ATM,
CHEK2, etc. Accumulated evidence has revealed the value of homologous recombination deficiency (HRD) in PCa, representing a high risk of PCa carcinogenesis and aggressiveness. A quarter of patients with recurrent or advanced PCa carry germline or somatic mutations in HRD-related genes [
7]. The most commonly altered HRD-related gene in prostate cancer is
BRCA2, with a prevalence of 5–6% at the germline level in mCRPC patients [
8,
9]. A previous study revealed that
BRCA2 mutation carriers have a 5.0 to 8.6-fold increased risk and a 15% absolute risk of developing PCa [
10,
11]. Moreover,
BRCA2 mutation carriers have higher progression rates from local to systemic disease, higher Gleason scores, shorter metastasis-free survival, and lower overall survival rates when compared to non-carriers [
12‐
14]. In general, HRD is closely associated with a worse prognosis in PCa.
By extracting the HRD scores and other information from The Cancer Genome Atlas prostate adenocarcinoma cohort (TCGA-PRAD), we established an HRD signature to distinguish between high-risk and low-risk PCa patients. Through in-depth analysis, we identified and validated the protective effect of Solute Carrier Family 26 Member 4 (SLC26A4) in PCa, which may guide the application of poly(ADP-ribose) polymerase (PARP) inhibitors in PCa as a complement to the commonly mutated HRD-related genes.
Materials and methods
Prostate cancer datasets and preprocessing
Three open datasets with prostate cancer samples, multi-omics data, and complete clinical information were retrieved from the Cancer Genome Atlas (TCGA), Memorial Sloan Kettering Cancer Center (MSKCC), and Gene-Expression Omnibus (GEO) databases on August 22, 2021, including TCGA-PRAD [
15], MSKCC-PRAD [
16], and GSE116918 [
17] cohorts. Fragments per kilobase of exon model per million mapped fragments (FPKM) values were then transformed into transcripts per kilobase million (TPM) values and log-transformed. The HRD scores, comprising loss of heterozygosity (LOH), telomeric allelic imbalance (TAI), and large-scale state transitions (LST), as well as gene-level copy numbers, PARADIGM integrated pathways, immune subtypes, and gene-level non-silent mutations, were downloaded from the Pan-Cancer (PANCAN) cohort in UCSC Xena (
https://xenabrowser.net/) [
18,
19]. Patients in the TCGA-PRAD cohort without HRD scores were excluded from further analysis.
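The FPKM-to-TPM step can be illustrated with a minimal sketch. It uses the standard relation TPM = FPKM / sum(FPKM) × 10^6 within each sample. The pseudocount of 1, the base-2 logarithm, and the function and gene names are assumptions for illustration, since the text does not state them.

```python
import math

def fpkm_to_log_tpm(fpkm, pseudocount=1.0):
    """Convert one sample's FPKM values to log2(TPM + pseudocount).

    Assumption: log base 2 and a pseudocount of 1; the text only says
    the TPM values were log-transformed.
    """
    total = sum(fpkm.values())
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    return {
        gene: math.log2(value / total * 1e6 + pseudocount)
        for gene, value in fpkm.items()
    }

# Hypothetical three-gene sample.
sample = {"GENE_A": 10.0, "GENE_B": 30.0, "GENE_C": 60.0}
log_tpm = fpkm_to_log_tpm(sample)
```

Because TPM rescales each sample to the same total, values become comparable across samples in a way that raw FPKM values are not.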
The HRD scores and genome-wide DNA damage footprints were last updated on June 13, 2017. We divided patients in the TCGA-PRAD cohort into quartiles according to their HRD scores. Quartiles 1 and 4 were defined as the bottom HRD group and the top HRD group, respectively. Differential expression analysis was performed on the transcriptomic data of the two groups using the “limma” R package. Genes with |log2(fold change)| > 0.5 and p value < 0.05 were selected for subsequent univariate Cox analysis, and those significantly correlated with the patient progression-free interval (PFI or PFS) were defined as HRD-related genes. Their mutational and expression profiles were investigated. We also calculated Spearman’s correlations among their mRNA expression levels and displayed them as an intra-correlation plot.
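The grouping and filtering steps above can be sketched as follows. This is only an illustration of the thresholds, not the limma workflow itself. The inclusive quartile method, the handling of ties at the cut points, and all function, patient, and gene names are assumptions.

```python
import statistics

def split_hrd_quartiles(hrd_scores):
    """Return (bottom, top) patient IDs for quartiles 1 and 4 of the HRD score."""
    # Assumption: inclusive quartile cut points; patients tied with a cut
    # point are placed in the corresponding extreme group.
    q1, _, q3 = statistics.quantiles(hrd_scores.values(), n=4, method="inclusive")
    bottom = sorted(p for p, s in hrd_scores.items() if s <= q1)
    top = sorted(p for p, s in hrd_scores.items() if s >= q3)
    return bottom, top

def filter_de_genes(results, min_abs_log2fc=0.5, max_p=0.05):
    """Keep genes with |log2 fold change| > 0.5 and p value < 0.05."""
    return [
        gene
        for gene, (log2fc, p_value) in results.items()
        if abs(log2fc) > min_abs_log2fc and p_value < max_p
    ]

# Hypothetical HRD scores for eight patients.
scores = {f"P{i}": float(i) for i in range(1, 9)}
bottom_group, top_group = split_hrd_quartiles(scores)

# Hypothetical differential-expression output: gene -> (log2FC, p value).
de_results = {"GENE_A": (0.8, 0.01), "GENE_B": (-0.6, 0.20), "GENE_C": (-1.2, 0.001)}
candidates = filter_de_genes(de_results)
```

Only the genes passing both thresholds would be carried forward to the univariate Cox analysis.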
Unsupervised clustering for HRD-related genes
Unsupervised clustering analysis was applied to identify distinct HRD patterns based on the expression of the above prognostic HRD-related genes and to classify patients for further analysis. The consensus clustering algorithm determined the number of clusters and their stability. We used the ConsensusClusterPlus package to perform these steps, and 1000 repetitions were conducted to guarantee the stability of the classification [
20].
The mRNA expression level of each HRD-related gene was depicted among the clusters. Principal Component Analysis (PCA) and Kaplan–Meier survival analysis were performed to assess the power of clustering. The distributions of clinicopathological characteristics, including age at diagnosis, Gleason score, primary outcome, biochemical recurrence (BCR), pathologic T stage, pathologic N stage, original zone of cancer, and immune subtype, were evaluated across the clusters.
Pathway quantification at transcriptomic and proteomic levels
The PARADIGM algorithm integrates pathway, expression, and copy number data to infer the activation of pathway features within a superimposed pathway (SuperPathway) network structure. The SuperPathway system comprises 1387 constituent pathways from three pathway databases, NCI-PID, BioCarta, and Reactome (last updated 05/2013), containing 19K pathway features that represent 7369 genes, 9354 complexes, 2092 families, 82 RNAs, 15 miRNAs, and 592 abstract processes. The dataset provides ssGSEA scores for the 1387 constituent pathways [
19].
Reverse-phase protein array (RPPA) data from the PANCAN cohort were used to calculate the pathway activity scores of 10 cancer-related pathways. RPPA is a high-throughput antibody-based technique with procedures similar to those of Western blots. Proteins are extracted from tumor tissue or cultured cells, denatured by SDS, printed on nitrocellulose-coated slides, and then probed with antibodies. The pathways included Apoptosis, Cell Cycle, DNA Damage Response, Epithelial-Mesenchymal Transition (EMT), Hormone a, Hormone b, PI3K/AKT, RTK, and TSC/mTOR. In brief, RBN RPPA data were median-centered and normalized by the standard deviation across all samples for each component to obtain the relative protein level. The pathway activity score is then the sum of the relative protein levels of all positive regulatory components minus that of the negative regulatory components in a particular pathway [
21].
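The pathway activity score described above can be sketched in a few lines. The example assumes the sample standard deviation is used for scaling. The protein names and the assignment of positive and negative regulators are hypothetical.

```python
import statistics

def relative_levels(values):
    """Median-center one protein across samples and scale by its SD."""
    # Assumption: sample (n - 1) standard deviation.
    center = statistics.median(values)
    spread = statistics.stdev(values)
    return [(v - center) / spread for v in values]

def pathway_scores(rppa, positive, negative):
    """Per-sample score: sum of positive regulators minus negative ones."""
    rel = {protein: relative_levels(vals) for protein, vals in rppa.items()}
    n_samples = len(next(iter(rppa.values())))
    return [
        sum(rel[p][i] for p in positive) - sum(rel[p][i] for p in negative)
        for i in range(n_samples)
    ]

# Hypothetical RPPA values for two proteins across three samples.
rppa = {"ACTIVATOR": [1.0, 2.0, 3.0], "REPRESSOR": [3.0, 2.0, 1.0]}
scores = pathway_scores(rppa, positive=["ACTIVATOR"], negative=["REPRESSOR"])
```

In this toy case the third sample has the highest activator level and the lowest repressor level, so it receives the highest pathway score.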
Estimation of tumor purity and fractions of immune cells
Estimation of stromal and immune components and tumor purity in tumor tissues using expression data was achieved by the “ESTIMATE” R package [
22]. Subsequently, the population abundance (fraction) of tissue-infiltrating immune and stromal cell populations was estimated by three well-known algorithms: MCP-counter (10 cell types) [
23], ImmuCellAI (24 cell types) [
24], and CIBERSORT (22 cell types) [
25].
Essential molecular characteristics of the tumor
We extracted vital molecular features of malignant tumors from an integrated and in-depth bioinformatics study [
26], including proliferation, leukocyte fraction, B cell receptor (BCR) evenness, T cell receptor (TCR) evenness, Th1, Th2, and Th17 cells, aneuploidy score, intratumor heterogeneity (ITH), single nucleotide variant (SNV) neoantigens, insertion-and-deletion (indel) neoantigens, cancer-testis antigen (CTA) score, homologous recombination defects, and fraction of altered genome. The microsatellite instability (MSI) MANTIS score was downloaded from cBioPortal for Cancer Genomics (
https://www.cbioportal.org/).
Immunomodulator identification and analysis
A list of 78 immunomodulatory genes was obtained from a previous study that curated them from a literature review performed by immuno-oncology experts within the TCGA immune response working group [
26]. Corresponding median mRNA expression levels were used to summarize expression in each cluster. We performed a limma differential analysis across clusters to identify immunomodulatory genes that were significantly differentially expressed. Copy number data for each immunomodulator gene were also obtained from the PANCAN cohort and coded as deep amplifications (2), shallow amplifications (1), non-alterations (0), shallow deletions (−1), and deep deletions (−2). Proportions of samples with each type of copy number alteration were then compared across HRD clusters.
Profiling of prognostic hub genes and dimensionality reduction
We performed differential expression analysis between each pair of HRD clusters in this cohort and performed Cox survival analysis on the intersection of the resulting differentially expressed genes. Those with survival significance were defined as prognostic hub genes, whose expression patterns were used as the basis of the subsequent PCA. The risk signature was termed ‘HRDscore’ and calculated by the following formula:
$$HRDscore=\sum \left[\left({PC}_{1}+{PC}_{2}\right)\times {expression}_{risk}-\left({PC}_{1}+{PC}_{2}\right)\times {expression}_{protective}\right]$$
where \({expression}_{risk}\) stands for the expression levels of risk genes and \({expression}_{protective}\) stands for those of protective genes.
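The formula can be read as a weighted sum over genes, sketched below. The text does not define how PC1 and PC2 are attached to each gene, so this example assumes they are per-gene loadings on the first two principal components. That reading, and all names and values, are assumptions for illustration only.

```python
def hrd_score(expression, loadings, risk_genes, protective_genes):
    """Weighted risk-gene sum minus weighted protective-gene sum.

    Assumption: loadings[gene] = (PC1, PC2) are per-gene PCA loadings.
    """
    def weighted(genes):
        return sum(
            (loadings[g][0] + loadings[g][1]) * expression[g] for g in genes
        )
    return weighted(risk_genes) - weighted(protective_genes)

# Hypothetical patient with one risk gene and one protective gene.
expression = {"RISK_1": 2.0, "PROT_1": 1.0}
loadings = {"RISK_1": (0.5, 0.25), "PROT_1": (0.25, 0.25)}
score = hrd_score(expression, loadings, ["RISK_1"], ["PROT_1"])
```

Higher expression of risk genes raises the score and higher expression of protective genes lowers it, which matches the signs in the formula.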
Patients were dichotomized into high HRDscore and low HRDscore groups based on the best cut-off determined by X-tile software. A Sankey plot was built to investigate the intrinsic relationship among HRD cluster, immune subtype, and HRDscore. Furthermore, we explored the correlations between the HRDscore and clinicopathological features, including survival. For subgroup analysis, TCGA-PRAD patients were divided into groups based on the following features: age (≤ 45 years old or > 45 years old) and Gleason score (< 8 or ≥ 8). Finally, multivariate Cox analyses were conducted to test the robustness of the established HRDscore.
Prediction of immunotherapy response and correlation with immune cells
ImmuCellAI was used to predict the response to immune checkpoint blockade (ICB) therapy from the transcriptomic data [
24]. A receiver operating characteristic (ROC) curve was built to illustrate the power of HRDscore in predicting immunotherapy response.
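The area under a ROC curve can be computed without plotting, as the probability that a randomly chosen responder has a higher score than a randomly chosen non-responder. The sketch below uses this rank-based definition. The scores and labels are hypothetical, and the direction of the score (higher means responder) is an assumption.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of responder/non-responder pairs ranked correctly.

    labels: 1 for predicted responders, 0 for non-responders.
    Ties between a responder and a non-responder count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four patients.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.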
We calculated the correlations between HRDscore and fractions of immune cells and the prognostic value of these cell types. Next, several genes were obtained after the intersection between HRD-related and prognostic hub genes. Their relationship to immune cells was also measured to find critical genes that bridge HRD scores, immune infiltration, and patient prognosis.
Quantitative real-time PCR assay
Quantitative real-time PCR was performed with SYBR Green PCR mixture on a Roche LightCycler 480 system according to standard protocols. PCR conditions were: one cycle of 5 min at 95 °C, then 45 cycles of 10 s at 95 °C, 10 s at 60 °C, and 10 s at 72 °C. The expression of the SLC26A4 gene was normalized to the expression of the GAPDH gene using the comparative CT method. Primers used were: SLC26A4 (F: 5′-AGGAAATATGCACTGCTCACT-3′; R: 5′-AGTATTCCCGCAGTTTGCTGA-3′); GAPDH (F: 5′-CAAGGCTGAGAACGGGAAG-3′; R: 5′-TGAAGACGCCAGTGGACTC-3′).
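The comparative CT calculation can be sketched as below, using the usual 2^(−ΔΔCT) form. It assumes roughly 100% amplification efficiency for both primer pairs. The CT values and the choice of control sample are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of the target gene versus a control sample (2^-ddCT).

    Assumption: both primer pairs amplify with about 100% efficiency.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_ct_sample - delta_ct_control)

# Hypothetical CT values: SLC26A4 and GAPDH in a tumor and a control sample.
fold_change = relative_expression(
    ct_target=26.0, ct_reference=18.0,
    ct_target_ctrl=24.0, ct_reference_ctrl=18.0,
)
```

Here the target needs two more cycles than in the control after normalization to GAPDH, which corresponds to a fourfold lower expression.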
Prostate cancer samples and immunohistochemistry
Prostate cancer samples were acquired from Xiangya Hospital of Central South University. A physician obtained informed consent from the patients. The procedures related to human subjects were approved by the Ethics Committee of Xiangya Hospital, Central South University. Tissues were fixed in 10% buffered formalin, then transferred to 70% alcohol. The paraffin-embedded tissues were sectioned (4 μm) and stained with antibodies against SLC26A4 (HPA042860, Atlas Antibodies). The subsequent detection and visualization procedures were performed according to the manufacturer's protocol. To quantify positive staining in the immunohistochemistry (IHC) results, five random areas in each tissue sample were microscopically examined and analyzed by an experienced pathologist. The average staining score was calculated by dividing the positive area by the total area examined.
Statistical analyses
The univariate and multivariate Cox analyses were performed to detect the prognostic factors. Kaplan–Meier curves with the log-rank test were used to assess survival differences between groups. Spearman correlation analyses were used to calculate correlations. The cutoff value was determined using the X-tile software (version 3.6.1). All statistical analyses were conducted using R software (version 4.1.2), and most visualization was achieved using the “ggplot2” R package. P < 0.05 was considered statistically significant.
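For readers unfamiliar with the Kaplan–Meier estimator behind the survival curves, a minimal sketch follows. It is not the R code used in the study. The follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events: 1 if progression was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) for five patients.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Patients censored at a given time still count as at risk for events at that same time, which is the usual convention.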
Discussion
Researchers have long explored the initiation, development, and treatment of PCa. Gleason score and serum PSA level are still the most important prognostic factors of PCa. Recently, increasing evidence has suggested that HRD plays a key role in the biological processes and therapeutic responses of various tumors and is one of the most influential factors for tumor prognosis [
27]. In addition, many studies have found that mCRPC patients with HRD-related gene mutations show impressive responses to PARP inhibitors, even in very advanced disease settings [
28‐
32]. The common HRD-related gene mutations in PCa are
BRCA2,
ATM, and
CHEK2, all of which are included in the molecular eligibility criteria of virtually all PARP inhibitor trials involving mCRPC patients [
33]. However, their germline mutations were found in 5–6%, 1–2% and 1–2% of mCRPC patients, respectively [
8,
9]. Therefore, new biomarkers need to be developed for the molecular typing of PCa patients. In this study, we analyzed in depth the molecular characteristics of PCa patients with different HRD scores and identified a biomarker that could be complementary to the HRD scores.
The HRD score integrates three indicators of DNA-based genomic instability and has been less explored in prostate cancer. A previous study found that patients with primary prostate cancer have lower HRD scores, while patients with germline
BRCA2 mutations have higher HRD scores [
34]. Since
BRCA2 mutation is an indication for two PARP inhibitors recently approved by the Food and Drug Administration (FDA) for the treatment of PCa, HRD score analysis may help improve treatment selection. In this study, by analyzing the HRD scores of the TCGA-PRAD cohort, we obtained 23 genes associated with HRD scores, defined as HRD-related genes. Based on these genes, we identified three molecular patterns with distinct clinicopathological characteristics, and HRD cluster 1 was particularly correlated with worse clinicopathological types and poor prognosis.
The HRD clusters demonstrated distinct immune landscapes. In general, T cells, CD8 T cells, cytotoxic lymphocytes, and natural killer (NK) cells were less enriched in HRD cluster 1, suggesting that a suppressed immune response may explain the poor outcome of patients in this cluster. The HRD clusters also showed different immunogenomic characteristics. Specifically, HRD cluster 1 harbored the highest mutational burden, the highest proliferation potential, and the lowest genomic stability, indicating a strong potential to acquire mutations and undergo subsequent carcinogenesis. Besides, HRD cluster 1 demonstrated the lowest leukocyte abundance but the highest M2 macrophage infiltration. Macrophage infiltration in solid tumors is associated with poor prognosis [
35]. A previous study found that macrophages infiltrating PCa were mainly of the M2 type and were associated with invasiveness and unfavorable outcomes. We also noticed that the Th1 to Th2 ratio was lowest in cluster 1. Cellular immunity mediated by Th1 cells mainly plays an anti-tumor role. A shift from Th1 to Th2 results in immunosuppression and seriously disturbs the body's anti-tumor immunity. Yamamura et al. and Kharkevitch et al. first found that Th2 cells were dominant in tumor patients [
36,
37], and a Th2 shift was subsequently found in many types of tumors, such as non-small cell lung cancer, choriocarcinoma, glioma, gastric cancer, ovarian cancer, melanoma, colorectal cancer, and lymphoma. The above results define an immunosuppressive microenvironment phenotype of prostate cancer and an unstable genomic condition in HRD cluster 1.
Given the importance of immunomodulators (IMs) in cancer immunotherapy, we compared IM gene expression among the three clusters and examined the genes that differed most among them. In general, most IMs were expressed at lower levels in HRD cluster 1 than in clusters 2 and 3, suggesting that immune responses regulated by membrane checkpoints were less common there. Consistently, copy number variations of IMs, both amplifications and deletions, were more frequent in HRD cluster 1, confirming the unstable genomic phenotype. Although such a trend was not evident for SNVs, several atypical checkpoints still had higher variation frequencies, such as
GZMA,
PRF1,
ENTPD1, and
ARG1. Paradoxically, TNFSF4 was significantly up-regulated in HRD cluster 1. Recent studies have shown that stimulation of OX40, the receptor for TNFSF4 (OX40L), is helpful for therapeutic immunization strategies for cancer [
38]. It has been found that TNFSF4 is enriched in bone metastatic PCa [
39]. Combined with our results, it may serve as a new therapeutic target in PCa, especially for those patients with high expression.
Furthermore, we established a signature (termed HRDscore) with strong and stable power to predict prognosis. Based on proteomic data, the HRDscore was tightly correlated with existing signatures related to genomic instability, including homologous recombination, DNA damage repair, and Fanconi anemia (correlation coefficient ≥ 0.5, p < 0.001). This result suggested that the HRD-derived risk system could represent the signature of genomic defects. Besides, the HRDscore was positively related to macrophages, the unfavorable cell type, which was consistent with the hypothesis above. A recent article has explored HRD scores in PCa, focusing on the correlation between HRD scores and mutations of
BRCA2 and
ATM [
34]. However, these mutations are not common in PCa, especially in non-mCRPC, so the HRD score is of little value to numerous PCa patients without these mutations. In comparison, our HRDscore has excellent value for predicting the prognosis and even guiding treatment in PCa.
To further explore valuable biomarkers, we finally focused on a single gene, SLC26A4, which correlated with immune infiltration and clinical diagnosis. It showed protective effects in several independent PRAD cohorts (RR 0.39, 95% CI 0.29–0.93, I² = 0). Functional single-cell analysis suggested that SLC26A4 was negatively correlated with "DNA damage" and "invasion" functions. Nevertheless, the lack of a prostate cancer single-cell cohort with SLC26A4 expression data hampered our understanding of its functions in PCa. Previous studies have mostly focused on the role of SLC26A4 in maintaining normal hearing and have not explored its significance in malignancies [
40]. Our study revealed its potential value in tumorigenesis and development for the first time, which is worthy of in-depth exploration in future research.
SLC26A4 encodes a membrane protein called pendrin that permits the anion exchange between the cytosol and extracellular space, maintaining the proper function of auditory sensory cells. It is mainly expressed in the inner ear and thyroid gland, and its mutation is related to dyshormonogenic goiter and Pendred syndrome [
41,
42]. Hypermethylation of SLC26A4 often occurs in cancers such as thyroid cancer and acute myeloid leukemia [
43,
44], consistent with our results. All the above findings indicated that the epigenetic changes of SLC26A4 may be involved in tumorigenesis. Our study uniquely found that SLC26A4 was highly associated with HRD in prostate cancer.
In our own Xiangya cohort, the SLC26A4 expression in PCa samples was lower than that in benign prostatic hyperplasia tissues at both mRNA and protein levels, which was inconsistent with the results of the TCGA-PRAD cohort. This may be due to the insufficient sample size of our cohort. Therefore, it needs to be further confirmed. Importantly, we found that SLC26A4 performed well in predicting HRD in patients with PCa. Patients with HRD-related gene mutations are often sensitive to PARP inhibitors, so we proposed that SLC26A4 may be a novel biomarker to screen patients sensitive to PARP inhibitors.
Consequently, we herein provided a potential biomarker for the treatment of PCa with PARP inhibitors. However, several limitations should be addressed in our study. First, there is a lack of SLC26A4 expression data in the prostate cancer single-cell cohort, which has been mentioned above. Secondly, our analyses were also limited by the relatively small sample size. Finally, due to the lack of prognostic and treatment information in the cohort, we failed to thoroughly verify the value of SLC26A4 in suggesting prognosis and guiding treatment. Therefore, further validation based on a large cohort is warranted.