Background
Cervical cancer is the fourth most common cancer in females, with an estimated 604,127 new cases and 341,831 deaths worldwide in 2020 [1]. Squamous cell carcinoma (SCC) is the predominant histological type of cervical cancer, with adenocarcinoma (AC) occurring less frequently [2]. Persistent high-risk human papillomavirus (HR-HPV) infection is associated with the development of cervical intraepithelial neoplasia (CIN), which, if untreated, may progress to SCC over a period of 15 to 20 years [3]. Currently, the World Health Organization (WHO) recommends a two-tier system of low- and high-grade squamous intraepithelial lesions (LSIL and HSIL), paralleling the terminology of the Bethesda System for cytologic reports, to replace the older CIN classification [4].
Cervical carcinogenesis is a complex process occurring as a consequence of multiple genomic alterations. Several expression microarray studies have investigated transcriptome changes in this process. Some research focused on specific dysregulated genes mediating the invasion of cervical cancer cells [5, 6]. Other research was designed to identify molecular changes that drive cervical cancer development [7–9]. Of note, studies based on next-generation sequencing are rare, probably due to ethical reasons and difficulties in obtaining tissue samples. Driven by the need for a comprehensive molecular characterization of the carcinogenic process, we performed a meta-analysis on publicly available gene expression profiles for an in-depth study.
This study is also motivated by the clinical need for novel biomarkers of cervical carcinogenesis. On the diagnostic front, early detection of HSIL and subsequent surgical intervention are necessary to prevent further progression [10]. However, the inter- and intra-observer reproducibility of SIL grading among pathologists is often poor because of mimics of HSIL (e.g., atrophy, LSIL, and therapy-related changes) [11–13]. In routine pathology practice, p16INK4a and Ki-67 are the most commonly used biomarkers of HR-HPV infection and cell proliferation, respectively. p16INK4a has been shown to distinguish HSIL from its mimics and to improve diagnostic consistency for precancerous lesions among pathologists [14, 15]. Nonetheless, p16INK4a is also positive in a proportion of normal cervical tissue, cervicitis, and LSIL samples, which limits its specificity for detecting HSIL [16–18]. On the prognostic front, although the incidence and mortality of cervical cancer are decreasing owing to increased global vaccination and screening coverage, clinical outcomes of patients with advanced-stage or recurrent disease remain bleak and difficult to predict [19]. Driven by the need for effective biomarkers to improve the diagnosis of HSIL and the prognosis of SCC, we specifically focused on screening for persistently altered genes involved in carcinogenesis.
Discussion
This meta-analysis of previously published datasets comprehensively characterizes the transcriptomic profiles of cervical carcinogenesis and identifies four key genes (AURKA, TOP2A, RFC4, CEP55) associated with the initiation and progression of SCC. We then assessed their diagnostic performance in HSIL/HSIL+ and their prognostic performance in SCC. To the best of our knowledge, our study is the first to evaluate and validate the diagnostic and prognostic value of RFC4 in cervical lesions.
We found that the transcriptomes of normal epithelium and SILs were homogeneous, whereas heterogeneity increased upon progression to SCC. A study of the transcriptomic landscape of hepatocarcinogenesis reported homogeneity in dysplastic lesions and early carcinoma but heterogeneity in advanced liver cancer, somewhat similar to our results [42]. Because FIGO staging and histological grading data were lacking in our discovery datasets, we could not determine whether heterogeneity exists between early SCC and preinvasive or late SCC. In the PCA of GSE63514, HSIL partially overlapped with normal/LSIL and SCC. Moreover, HSIL showed higher heterogeneity than normal tissue in GSE27678. Akin to genetic alterations, we believe that some dysregulated genes that are common in SCC but altered in only a subset of HSIL contributed to the potential heterogeneity of HSIL [43, 44].
Compared with enrichment analysis of all DEGs together, separate enrichment of up- and downregulated genes can detect more pathways associated with the phenotypic difference [45]. We used both strategies in this study. Although the separate analysis consistently detected more terms and pathways, pathways that each strategy detected in a different disease stage (e.g., the Wnt signaling pathway in LN_DN and CN_DEG) should not be ignored. Cell cycle, DNA repair, and oncogenic p53 pathways were activated in HSIL and SCC. The close association between these pathways and HPV has been demonstrated. HR-HPV E6 and E7 oncoproteins interfere with p53 and pRB, leading to cell cycle disturbances and promoting the DNA damage response (DDR), which has a known central role in cervical carcinogenesis [3, 46, 47]. Furthermore, we found inhibition of the TGF-β and Hippo signaling pathways in LSIL, consistent with their tumor-suppressive properties in the early stage of carcinogenesis [48, 49]. Interestingly, the HTLV-1 infection and IL-17 signaling pathways were enriched in all disease stages. Deregulation of the cell cycle is a common feature of cancer cells and HTLV-1-infected cells, which is why we believe that the HTLV-1 infection pathway was enriched in HN and CN [50, 51]. The HTLV-1 Tax oncoprotein interacts with SRF to activate the transcription of immediate early genes (FOS, FOSL1, EGR1, and EGR2) [52, 53]. However, these genes were downregulated in LN. The association between HPV and SRF in the early stage of cervical carcinogenesis might be worth investigating. Among the IL-17 cytokine genes, IL17C showed significantly lower expression in HSIL and SCC than in normal controls. IL17C is an epithelial cell-derived cytokine that regulates innate epithelial immune responses [54], and its response to HPV infection has not been explicitly investigated. A previous study reported that increased Th17 cells were associated with progression of SCC [41], which is inconsistent with our results. However, other studies reported that lymph nodes of premalignant lesion-bearing mice contained more Th17 cells than lymph nodes of mice bearing head and neck squamous cell carcinoma (HNSCC) [55, 56]. Reduced IL23 production and increased TGF-β production by HNSCC may decrease Th17 cells by redirecting the immune phenotype toward Treg [56].
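The difference between enrichment with all DEGs and separate enrichment with up- and downregulated genes, discussed at the start of this paragraph, can be illustrated with a small over-representation test. The sketch below is only a toy illustration: it assumes a one-sided hypergeometric test, which is a common choice but is not stated here to be the method used in this study, and the background size, pathway size, and gene counts are invented for the example.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) when n genes are drawn from a background of N genes,
    K of which belong to the pathway (one-sided over-representation test)."""
    upper = min(K, n)
    # Exact integer arithmetic; Python converts to float only at the final division.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Hypothetical numbers, chosen only to illustrate the effect.
N_BACKGROUND = 10_000   # genes measured on the platform (assumed)
PATHWAY_SIZE = 100      # genes annotated to one pathway (assumed)

# (size of the gene list, number of pathway genes found in that list)
gene_lists = {
    "up": (200, 12),         # pathway genes concentrated among upregulated DEGs
    "down": (800, 4),        # few pathway genes among downregulated DEGs
    "all_DEGs": (1000, 16),  # pooled list: 200 + 800 genes, 12 + 4 hits
}

p_values = {
    name: hypergeom_sf(hits, N_BACKGROUND, PATHWAY_SIZE, size)
    for name, (size, hits) in gene_lists.items()
}

for name, p in p_values.items():
    print(f"{name:>8}: p = {p:.3g}")
```

In this toy setting the pathway is far more significant in the upregulated list than in the pooled list, because pooling dilutes the signal with the larger downregulated list. This is one plausible reason a separate analysis can report more pathways, while a pooled analysis can still pick up pathways whose members change in both directions.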
We identified four hub genes through network analysis. Using IHC, we validated that expression of the hub genes increased gradually with the severity of lesions. Notably, CEP55 was initially reported to be associated with the course of cervical lesions. The staining pattern of TOP2A was similar to that of Ki-67, and concordance between them was substantial. By contrast, a study comparing ProExC and Ki-67 expression in 197 cervical biopsies reported discordant staining in 35% of cases [57]. We then compared the diagnostic performance of p16INK4a, Ki-67, TOP2A, and RFC4, alone or in combination, for detecting HSIL/HSIL+. Among the four markers, p16INK4a, which is routinely used in clinical practice, showed the highest sensitivity but only moderate specificity. Similar to previous reports, the combination of p16INK4a and Ki-67 in serial interpretation improved specificity and accuracy for detecting HSIL [58]. RFC4 and TOP2A alone each provided diagnostic performance similar to that of the combination of p16INK4a and Ki-67. Parallel interpretation of TOP2A and RFC4 produced the highest AUC, and parallel interpretation of Ki-67 and RFC4 produced the highest sensitivity and NPV for detecting HSIL. Importantly, RFC4 and TOP2A have additional advantages. RFC4 is located at 3q26, and its expression was highly correlated with copy number gain; 3q gain, a potential marker for the diagnosis of HSIL, is frequently found in cervical cancer and its precancerous lesions [59, 60]. For TOP2A, its exclusive and clear nuclear staining is an advantage over the combined nuclear and cytoplasmic staining of RFC4 and p16INK4a. Moreover, Shi et al. reported that TOP2A is more sensitive and specific than ProExC for detecting HSIL [61]. Considering cost-effectiveness, a single biomarker with balanced sensitivity and specificity and high accuracy is recommended. For patients with suspected HSIL, parallel interpretation of Ki-67/TOP2A and RFC4, which offers high sensitivity and NPV, can be chosen to safely exclude lesions.
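The trade-off between serial and parallel interpretation described above can be made concrete with a short calculation. The sketch below assumes the conventional definitions (parallel: positive if either marker is positive; serial: positive only if both are positive), which this section does not spell out. The marker names `marker_a` and `marker_b` and all staining results are hypothetical and are not data from this study.

```python
def diagnostic_metrics(predicted, truth):
    """Compute sensitivity, specificity, PPV, and NPV from paired boolean lists."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    tn = sum((not p) and (not t) for p, t in zip(predicted, truth))
    fp = sum(p and (not t) for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical results for 10 biopsies (True = HSIL+ or marker positive).
truth    = [True, True, True, True, True, False, False, False, False, False]
marker_a = [True, True, True, True, False, True, False, False, False, False]
marker_b = [True, True, True, False, True, False, True, False, False, False]

# Parallel interpretation: call positive if EITHER marker is positive.
parallel = [a or b for a, b in zip(marker_a, marker_b)]
# Serial interpretation: call positive only if BOTH markers are positive.
serial = [a and b for a, b in zip(marker_a, marker_b)]

results = {
    "marker_a": diagnostic_metrics(marker_a, truth),
    "marker_b": diagnostic_metrics(marker_b, truth),
    "parallel": diagnostic_metrics(parallel, truth),
    "serial": diagnostic_metrics(serial, truth),
}

for name, metrics in results.items():
    summary = ", ".join(f"{key}={value:.2f}" for key, value in metrics.items())
    print(f"{name:>8}: {summary}")
```

By construction, parallel interpretation can never have lower sensitivity than either marker alone, and serial interpretation can never have lower specificity. This is why a parallel rule suits excluding lesions (high sensitivity and NPV), whereas a serial rule suits confirming them.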
Furthermore, we explored the clinical and prognostic significance of the identified genes in SCC. In contrast to the continuous increase of hub gene expression across the normal-to-SIL-to-SCC transitions, only AURKA mRNA expression increased significantly with advancing FIGO stage and with increasing tumor differentiation and aggressiveness in SCC, as indicated by poor OS. This is consistent with previous findings [62], though the prognostic value of AURKA could not be validated in the TJH cohort. A previous study demonstrated that high CEP55 protein expression correlates with better OS and recurrence-free survival (RFS) in SCC [63]. We observed the same trend in our study, but it was not statistically significant. Several studies based on TCGA-CESC data have reported a relationship between TOP2A and RFC4 mRNA expression and the prognosis of cervical cancer [64, 65]. However, no other evidence supported this relationship, let alone confirmed it at the protein level. Here, we demonstrate for the first time that increased RFC4 and TOP2A protein expression correlates with a favorable outcome in patients with SCC, and that RFC4 is an independent prognostic marker for SCC. Furthermore, preliminary investigations have also demonstrated a role for RFC4 in predicting the outcome of other neoplasms, such as non-small cell lung carcinoma, colorectal cancer, and breast tumors [66–68].
Our research has some limitations. Firstly, diagnostic error cannot be excluded entirely because the histopathologic diagnosis of CIN is subject to substantial rates of discordance among pathologists. Because diagnoses were based on the majority opinion of three expert gynecologic pathologists and the sample size was large, we consider this diagnostic bias to have influenced the results only to a minor degree. Secondly, we focused here on the dynamic expression and clinical application of RFC4, and therefore could not clarify the cause-and-effect relationship between RFC4 overexpression and disease progression. Our laboratory has ongoing experimental studies of RFC4 in papillomavirus oncogenic cell transformation.