Background
Renal cell carcinoma (RCC) is the most common cancer of the adult kidney, corresponding to nearly 3% of all adult malignancies worldwide [
1], and is an important cause of cancer morbidity and mortality [
1]. Clear cell renal cell carcinoma (ccRCC) subtype is the most prevalent [
2], making it especially important to identify the molecular changes associated with malignant transformation and with longer survival [
3,
4]. Malignant transformation has been associated with several changes in gene expression patterns, which are critical to several steps of tumor progression [
5].
The noncoding RNAs (ncRNAs) exceed the number of protein-coding genes several fold [
6], and both microRNAs (miRNAs; 21–24 nt) and long ncRNAs (lncRNAs; ≥ 200 nt) are now emerging as key regulators of mammalian transcription in response to developmental or environmental signals [
7‐
9]. The lncRNAs are classified based on intersection with protein-coding genes; when they map outside a protein-coding
locus they are denominated long intergenic ncRNAs (lincRNAs) [
9]. Otherwise they are classified as intronic, and in this case they can be either sense or antisense with respect to the direction of transcription of the host protein-coding gene in the
locus [
9].
Following the first reports of miRNA expression profiles associated with different types of cancer [
10,
11], several independent studies over the past five years identified a number of miRNAs differentially expressed in RCC that are correlated with malignancy [
12‐
18] and with RCC subtype classification [
19,
20]. In addition, a metastasis signature comprising four miRNAs was recently described for ccRCC [
21].
It has become evident that not only miRNAs but also lncRNAs are important players in cancer [
22‐
27]. Studies of lncRNA expression have focused mainly on lincRNAs [
28,
29], essentially to simplify their analysis by avoiding possible complications arising from overlapping protein-coding genes [
30]. Thus, recent transcriptome sequencing showed that lincRNAs are aberrantly expressed in a variety of human cancers [
31]. A transcriptome sequencing study of a prostate cancer cohort identified the lincRNA
PCAT1 as implicated in malignancy progression [
32]. In human lung adenocarcinoma, another lincRNA,
MALAT1, has been associated with tumor metastasis [
33] and is overexpressed in five other types of human cancers [
34]. In a rare subtype of RCC, namely t(6;11) RCC, it has been described that
MALAT1 is fused to the
TFEB gene [
35,
36]. Recently, it has been shown that
Xist lincRNA is a potent suppressor of hematologic cancer in mice [
37].
Intronic lncRNAs constitute the major components of the mammalian ncRNA transcriptome [
38], and they are possibly involved in the fine-tuning of gene expression patterns across the entire genome [
39]. Although thousands of putative intronic lncRNAs have been identified [
9,
38,
40,
41], it remains to be determined which ones are functional. It is also a challenge to determine which ones are independently transcribed and which are by-products of pre-mRNA processing, with the levels of some of their intronic portions being independently regulated [
38,
42]. In fact, the mechanism of action of only a few intronic lncRNAs has been characterized in the context of cancer [
42‐
44]. In addition, a number of studies have reported correlations between the expression patterns of intronic lncRNAs and cancer, such as intronic lncRNAs correlated with the degree of tumor differentiation in prostate cancer [
45], intronic lncRNAs differentially expressed in primary and metastatic pancreatic cancer [
46] and in dasatinib-treated chronic myeloid leukemia patients with resistance to imatinib [
47]. In breast and ovarian cancer, Perez et al. [
48] identified 15 aberrantly expressed ncRNAs, of which at least three are intronic [
48]. In renal carcinoma, studies of long noncoding RNAs are sparse. Our group previously identified seven intronic lncRNAs significantly deregulated in a set of six ccRCC tumor samples when compared with adjacent nontumor tissues [
49]. Using a microarray approach, another study revealed tumor-associated lincRNAs when comparing gene expression profiles in six pairs of ccRCC and adjacent nontumor tissues [
50].
The present study focused on unspliced intronic lncRNAs, the least studied class of lncRNAs, in an attempt to identify possible new key molecules and pathways involved in renal carcinogenesis. To analyze gene expression patterns in tissue samples from RCC patients, we used two different microarray platforms enriched with probes for these intronic lncRNAs. We identified intronic lncRNAs whose differential expression was significantly correlated with RCC malignancy or with patient survival outcome. We also identified sets of intronic lncRNAs that are co-regulated in cis or in trans with protein-coding mRNAs of genes associated with transcriptional regulation and with kidney functions. Finally, our data demonstrate that RCC-expressed lncRNA loci are significantly associated with CpG islands and histone regulatory modifications typical of active RNA Pol II-transcribed genes, and that the expression pattern of intronic lncRNAs in RCC is markedly tissue-specific and evolutionarily conserved.
Discussion
In the present study, we determined the expression pattern of a collection of intronic lncRNAs in clear cell RCC patients and identified candidates that might play a role in renal cancer biology. There are only two published studies of lncRNAs in RCC so far: our previous study [
49] that identified for the first time seven intronic lncRNAs differentially expressed in RCC among a protein-coding gene signature; and the work of Yu
et al. that identified 626 lncRNAs differentially expressed between tumor and nontumor tissue in 6 clear cell RCC patients. These authors used a microarray that essentially probed intergenic lncRNAs [
50] and validated four transcripts by qPCR: three were intergenic lncRNAs (ENST00000456816, X91348 and NR_024418), and one was not a lncRNA but rather the non-coding 3′-end portion of the
TMEM72 protein-coding gene (BC029135).
We identified 29 lncRNA transcripts originating from intronic regions and an additional 11 from intergenic regions, resulting in a ccRCC-associated gene expression profile composed exclusively of lncRNAs. From this set, there are three intronic lncRNAs from the
ACTN4,
HDAC5 and
SLC2A1 loci identified as down-regulated both here and in our previous study [
49] using the same microarray platform. This partial overlap (3 out of the 6 intronic lncRNAs described in Ref. [
49]) is possibly related to the more stringent statistical criteria presently used, namely the leave-one-out approach that minimizes the contribution of each individual patient to the set of significantly altered genes when a small patient cohort is analyzed [
51,
72].
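The leave-one-out idea mentioned above can be illustrated with a minimal Python sketch. It is not the statistical procedure used in this study: the significance rule (`is_altered`), its threshold and the example log ratios are all hypothetical stand-ins, chosen only to show how a transcript whose apparent change depends on a single patient is filtered out.

```python
from statistics import mean


def is_altered(log_ratios, min_abs_mean=1.0):
    """Stand-in significance rule (an assumption, not the paper's test):
    every patient changes in the same direction and the mean
    log2(tumor/nontumor) ratio has magnitude of at least min_abs_mean."""
    same_sign = all(r > 0 for r in log_ratios) or all(r < 0 for r in log_ratios)
    return same_sign and abs(mean(log_ratios)) >= min_abs_mean


def leave_one_out_stable(expression, min_abs_mean=1.0):
    """Return transcripts called altered in every subset that omits one patient."""
    stable = []
    for transcript, ratios in expression.items():
        # Build each subset by dropping exactly one patient.
        subsets = [ratios[:i] + ratios[i + 1:] for i in range(len(ratios))]
        if all(is_altered(subset, min_abs_mean) for subset in subsets):
            stable.append(transcript)
    return stable


# Hypothetical log2(tumor/nontumor) ratios for six patients per transcript.
example = {
    "lncRNA_A": [-1.4, -1.1, -1.6, -1.2, -1.3, -1.5],  # consistently down
    "lncRNA_B": [-0.2, -0.1, -0.3, -0.2, -0.1, -6.0],  # driven by one patient
}
stable_transcripts = leave_one_out_stable(example)
```

In this toy input, `lncRNA_B` passes the rule when all six patients are pooled but fails once the outlying patient is left out, so only `lncRNA_A` is retained.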
The comparison of our 217 protein-coding gene profile with nine published studies of differentially expressed protein-coding genes in ccRCC [
5,
49,
52‐
58] showed that the vast majority (83%) of the genes in common (142/170) presented a concordant pattern of expression (Table
2), thus validating the present analysis as representative of the ccRCC biology.
Besides a set of intronic lncRNAs potentially involved in carcinogenesis, the present study identified a set of 26 intronic lncRNAs that were correlated with the survival of ccRCC patients. Of this set, eight lncRNAs were identified as altered in both the malignancy and the survival outcome expression profiles, and they are transcribed from the
loci:
ACTN4, CSNK1D, DNAJC3, GIGYF2, HDAC5, PTPN3, RAB25 and
VPS13B. To the best of our knowledge, this is the first study suggesting that lncRNAs are correlated with patient survival outcome in RCC. Regarding other types of ncRNAs, at least two miRNA expression studies have identified candidates correlated with patient survival outcome in RCC [
21,
73]. The lncRNAs identified in the present work may contribute to future studies focusing on lncRNAs as molecular markers in RCC oncology.
There are few examples of well-characterized lncRNAs associated with RCC. The lincRNA
GAS5 is a well described tumor suppressor in breast cancer [
74], and very recently it was described in prostate cancer cell lines [
75] and in RCC [
76]. A decreased expression of the lincRNA
GAS5 is associated with RCC genesis and progression, and its overexpression is associated with inhibition of cell proliferation and induction of apoptosis [
76]. Another example includes two antisense lncRNAs at the 5′ (5′aHIF-1α) and 3′ (3′aHIF-1α) ends of the human
HIF-1α gene that are expressed in human kidney cancer tissues [
77].
In cancer, there are a few examples of the mechanisms of action of intronic lncRNAs. Our group described the intronic antisense and unspliced lncRNA
ANRASSF1 that causes the epigenetic
in cis downregulation of the tumor suppressor
RASSF1A gene and increases cell proliferation [
43], and its expression is higher in prostate and breast cancer cell lines compared with nontumor cells [
43]. Guil
et al. [
42] identified that overexpression of the sense intronic lncRNA from the
SMYD3 locus caused the epigenetic
in cis regulation of
SMYD3 and a decrease in colorectal cancer cell line proliferation [
42]. The androgen-regulated intronic antisense lncRNA
CTBP1-AS [
44] appears to be a key antisense ncRNA that acts as both
cis- and
trans- regulator of gene expression. The
CTBP1-AS lncRNA promotes prostate cancer growth through sense-antisense repression of the transcriptional co-regulator
CTBP1 transcribed from the same
locus (
cis- regulation), and through a global epigenetic regulation of tumor suppressor genes (
trans- regulation) [
44]. In fact, the intronic and also the intergenic lncRNAs play important epigenetic roles in cancer [
78].
We decided to study the intronic lncRNA
ncHDAC5 in more detail because it showed decreased expression in ccRCC tumor compared with nontumor tissue, which was confirmed by qPCR, and because its increased expression seems to be associated with cancer-related death after surgery in RCC, as suggested by our patient survival outcome analysis. We determined that
ncHDAC5 is an unspliced long transcript (1.7 kb long), detected in the antisense and sense directions relative to the protein-coding gene histone deacetylase 5 (
HDAC5). It has a short half-life of 42 min compared with other well-studied lncRNAs, such as
Air,
Kcnq1ot1 and
Xist, which have half-lives of 2.1, 3.4 and 4.6 h, respectively [
79], and an evolutionarily conserved secondary structure. The absence of association between the expression of
ncHDAC5 and the protein-coding mRNA
HDAC5, determined by qPCR and by a meta-analysis of five kidney cancer gene expression studies (Table
1), suggests a
locus-independent function, with the
ncHDAC5 possibly acting in
trans to regulate protein-coding genes (see the discussion on
trans regulation below). Unfortunately, a probe for this
ncHDAC5 was not present in the 44 k oligoarray that was used for assessing the
trans correlation of expressed lncRNAs/mRNAs, so it was not possible to determine the
ncHDAC5 candidate target mRNAs by our co-expression analysis.
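To put the half-lives quoted above on a common scale, the following Python sketch computes the fraction of each transcript that would remain after a fixed time. It assumes simple first-order (exponential) decay, which is an assumption made here for illustration; the function name and the 126 min time point are likewise illustrative choices.

```python
def fraction_remaining(elapsed_min, half_life_min):
    """Fraction of a transcript left after elapsed_min minutes,
    assuming first-order decay with the given half-life."""
    return 0.5 ** (elapsed_min / half_life_min)


# Half-lives quoted in the text, converted to minutes.
half_lives_min = {
    "ncHDAC5": 42,
    "Air": 2.1 * 60,
    "Kcnq1ot1": 3.4 * 60,
    "Xist": 4.6 * 60,
}

# 126 min corresponds to three ncHDAC5 half-lives (3 x 42 min).
remaining = {
    name: fraction_remaining(126, half_life)
    for name, half_life in half_lives_min.items()
}
```

Under this assumption, after 126 min only one eighth (12.5%) of ncHDAC5 would remain, whereas about half of Air, the shortest-lived of the three comparison lncRNAs, would still be present.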
An in silico analysis indicated the presence of RNA Pol II binding and of the histone marks H3K27ac and H3K4me3 at ~1.5 kb upstream of the putative TSS of an antisense
ncHDAC5 transcript in the
HDAC5 locus. Considering the lack of methylation marks in the vicinity of the lncRNA, this observation opens an interesting possibility of transcriptional regulation of the antisense lncRNA
ncHDAC5 by histone acetylation. It is in line with the result recently described for the lncRNA-LET, a lncRNA generally downregulated in carcinomas, that was shown to be repressed by histone deacetylase 3 under hypoxic conditions [
80]. Interestingly, the transcriptional-activation-associated H3K4me1 and H3K27ac histone modification marks at human enhancers have been described as related to cell-type-specific protein-coding gene expression [
81]. The TSSs at the lncRNA
ncHDAC5 locus as well as at the
loci of the other intronic antisense lncRNAs expressed in RCC were enriched with both histone marks, in agreement with the fact that the intronic lncRNAs tend to have a tissue-specific pattern of expression [
9], thus supporting a possible cell-type specific modulation of intronic antisense lncRNAs by histone methylation and acetylation.
Because the intronic lncRNAs revealed a promising, well-defined pattern of altered expression in RCC, and data about this ncRNA class in RCC are scarce, we extended our study to the antisense intronic lncRNAs using a custom-designed strand-specific 44 k-element microarray that contained 15-fold more probes for lncRNAs than the 4 k-array that we had previously used. With this new platform, we identified 4303 antisense intronic lncRNAs expressed in RCC; 4061 of these 4303 antisense lncRNAs were not previously reported in the Yu et al. study [
50] as being expressed in RCC, which is in agreement with the fact that Yu et al. [
50] used a microarray that probed mostly intergenic lncRNAs. In addition, only six lncRNAs are already annotated as RefSeq noncoding RNAs (Additional file
8: Table S6). In fact, the most recent catalog of human intronic lncRNAs comes from the GENCODE project [
9], which documented the intronic lncRNAs expressed in 12 human normal tissues. Thus, the present study is a contribution towards the generation of a catalog of intronic antisense lncRNAs expressed in renal cancer.
The set of 4303 intronic antisense lncRNAs expressed in renal cancer identified in the present study probably has diverse functions, other than being precursors of small RNAs, because only one lncRNA mapped to a known small RNA sequence (U99, Additional file
8: Table S6). We found that 22% of the intronic antisense lncRNAs have expression levels in RCC, normal kidney, normal liver and prostate tumor that are correlated in
cis to the expression levels of the mRNA from the same
locus. These lncRNAs correlated in
cis are transcribed from
loci enriched with genes related to regulation, including the term “Regulation of Transcription from RNA polymerase II”, as seen when analyzing together the positively and negatively
cis-correlated antisense lncRNA/mRNA as well as when analyzing only the positively
cis-correlated transcripts (Additional file
9: Figure S3). Our group has described a similarly enriched GO term when analyzing the host gene
loci of the 30% most abundant intronic antisense lncRNAs, without considering any expression correlation between the ncRNAs and the mRNAs [
40]. Now we point to this GO term enrichment for those
loci expressing the antisense lncRNAs and the mRNAs in a correlated manner, reinforcing the suggestion that the lncRNAs might
cis-regulate the expression of the genes involved in “Regulation of Transcription” and/or that the antisense lncRNAs and the mRNAs might be controlled by a similar regulatory event in these
loci.
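The notion of a cis-correlated lncRNA/mRNA pair can be sketched in Python as follows. This is an illustration only: the use of the Pearson coefficient, the 0.9 cutoff, the function names and the expression values are assumptions introduced here, and the correlation measure and threshold actually applied in this study are those given in its Methods.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sd_x * sd_y)


def classify_cis_pair(lncrna_levels, mrna_levels, cutoff=0.9):
    """Label a same-locus lncRNA/mRNA pair by the sign and strength of correlation."""
    r = pearson(lncrna_levels, mrna_levels)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "uncorrelated"


# Hypothetical expression levels across four tissues
# (e.g. RCC, normal kidney, normal liver, prostate tumor).
lncrna = [1.0, 2.0, 3.0, 4.0]
labels = [
    classify_cis_pair(lncrna, [2.0, 4.0, 6.0, 8.0]),  # rises together
    classify_cis_pair(lncrna, [8.0, 6.0, 4.0, 2.0]),  # opposite trend
    classify_cis_pair(lncrna, [3.0, 1.0, 4.0, 2.0]),  # no linear relation
]
```

A correlation of this kind does not distinguish between the two interpretations discussed above: the lncRNA regulating its host gene in cis, or both transcripts responding to a shared regulatory event.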
We found that the expression of the majority of the intronic antisense lncRNAs was not correlated to the expression of the mRNA from the same
locus, and those are most likely regulated independently of the mRNAs. Among these, we identified a set of antisense lncRNAs whose expression in RCC, normal liver, prostate tumor and kidney nontumor tissues was positively or negatively correlated in
trans to the expression levels of sets of mRNAs belonging to enriched GO terms such as “Inflammatory response” and “Response to stress”; these protein-coding genes may be related to the cellular renal cancer context, and the correlated lncRNAs are candidates to be acting in
trans to regulate their expression. The present GO analyses support the proposal that ncRNAs might be part of a fine-tuning regulatory network in the cells [
82‐
84].
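GO term enrichment of the kind referred to above is commonly assessed with a one-sided hypergeometric test. The Python sketch below shows that calculation under stated assumptions: it is not necessarily the tool or test used in this study, the function name is illustrative, and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes on the array, 500 annotated with the term,
# 200 mRNAs correlated in trans with lncRNAs, 20 of which carry the term.
p_value = hypergeom_enrichment_p(20000, 500, 200, 20)
```

With these invented counts, only 5 annotated genes would be expected by chance among the 200 correlated mRNAs, so observing 20 yields a very small p-value. In practice such p-values are also corrected for the number of GO terms tested.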
Our computational analysis has generated a list of 4303 intronic antisense lncRNAs expressed in RCC (Additional file
8: Table S6) that includes subsets associated with CpG islands, CAGE tag marks, RNA Pol II binding sites, promoter-associated chromatin marks, tissue specificity and evolutionary conservation. The set of 53 intronic antisense lncRNAs expressed in common at syntenic
loci in human and mouse represents good candidates for subsequent in-depth biological follow-up work; the low overlap may be related to the known tissue-specific expression of lncRNAs [
8,
41] and to the known tissue-pattern of expression conservation among different species [
85], considering that St. Laurent et al. [
38] used mouse lung tissues and we have used human kidney tumor tissues. Although lncRNAs are much less conserved than other functional ncRNAs such as miRNAs and snoRNAs [
86], there is good evidence in the literature regarding the presence among the intronic lncRNAs of evolutionarily conserved regions spanning 400 nt or more [
39,
85,
87]. Our recent work with pancreatic cancer has identified an enrichment of conserved regions within intronic and intergenic lncRNAs [
46], and here we extend the identification of conserved regions to the intronic antisense lncRNAs expressed in RCC. Although some of the introns could contain regulatory sequences, or yet undiscovered coding exons overlapped by the intronic RNAs, thus accounting for part of the enrichment signal, the observed primary and secondary structure conservation suggests that the intronic lncRNAs are under the influence of evolutionary constraints.
In silico approaches have been successfully used to characterize sets of lncRNAs expressed in other tissues or cell lineages [
9,
28,
29,
46,
69]. Here, we used them to obtain new data indicating that intronic lncRNAs should not be regarded simply as by-products of random transcription [
38], but rather as a diverse and heterogeneous class of cellular transcripts that may comprise yet uncharacterized regulatory RNAs. The intronic lncRNAs identified here as expressed in RCC may have several mechanisms of action, both positively and negatively regulating gene expression, and consequently may constitute a promising starting point for further functional investigations.
Authors’ contributions
AAF designed the study, carried out microarray and RT-PCR experiments, performed in silico analyses and drafted the manuscript; ACT participated in microarray experiments and performed in silico analyses; SAVA participated in RT-PCR and performed ncHDAC5 characterization experiments; VMC performed in silico analyses; EG obtained clinical patient information; GV carried out histological tissue classification; FSC helped to conceive the study and obtained patient agreement and tissue samples for the study; EMR participated in the design and coordination of the study; SVA conceived and coordinated the study, and drafted the manuscript with input from all authors. All authors read and approved the final manuscript.