Introduction

Genome-wide surveys have revealed that eukaryotic genomes are extensively transcribed into thousands of long and small non-coding RNAs (ncRNAs) [25, 36, 66]. Small ncRNAs are about 20–30 nucleotides long and include the small interfering RNAs, microRNAs (miRNAs), and Piwi-interacting RNAs [3, 22, 58]. Long non-coding RNAs (lncRNAs) are a class of mRNA-like transcripts that range in length from 200 nt to over 100 kb and lack any significant open reading frame [33, 46]. On the basis of their position relative to protein-coding genes, lncRNAs can be divided into five broad categories: (1) sense; (2) antisense; (3) bidirectional, when transcription of the lncRNA and of a neighboring coding transcript on the opposite strand is initiated in close genomic proximity; (4) intronic, when the lncRNA is derived entirely from an intron of another transcript; and (5) intergenic, when the lncRNA is located between two genes [37]. In contrast to miRNAs, which have been intensively studied, the vast majority of lncRNAs have yet to be characterized thoroughly, and many of their functions probably await discovery. Studies have shown that a significant number of lncRNAs regulate gene expression in multiple ways (Fig. 1) [21, 27, 62]. Emerging evidence indicates that dysfunction of lncRNAs is associated with human diseases, including cancer [11, 16, 17, 49]. Here, we summarize recent progress in lncRNA research, with the aim of understanding the functions of these novel regulatory molecules.
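The five positional categories above can be made concrete with a short sketch. The following Python example is illustrative only and is not taken from any cited study: the `Transcript` record, the `classify` function, the half-open coordinate convention, and the 1,000-bp window used to call a bidirectional pair are all assumptions chosen for the illustration. Sense and antisense are taken here in their usual meaning (overlap with the coding gene on the same or the opposite strand, respectively).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record (0-based, half-open coordinates)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"
    introns: Tuple[Tuple[int, int], ...] = ()  # (start, end) of each intron


def tss(t: Transcript) -> int:
    """Transcription start site: first base on '+', last base on '-'."""
    return t.start if t.strand == "+" else t.end - 1


def classify(lnc: Transcript, gene: Transcript, bidir_window: int = 1000) -> str:
    """Assign an lncRNA to one positional category relative to one coding gene.

    bidir_window is an assumed cutoff for 'close genomic proximity'.
    """
    same_chrom = lnc.chrom == gene.chrom
    overlapping = same_chrom and lnc.start < gene.end and gene.start < lnc.end
    if overlapping:
        # Intronic: the lncRNA lies entirely within one intron of the gene.
        if any(s <= lnc.start and lnc.end <= e for s, e in gene.introns):
            return "intronic"
        # Otherwise the strand decides between sense and antisense.
        return "sense" if lnc.strand == gene.strand else "antisense"
    # Bidirectional: opposite strands, start sites close together, no overlap.
    if (same_chrom and lnc.strand != gene.strand
            and abs(tss(lnc) - tss(gene)) <= bidir_window):
        return "bidirectional"
    # Everything else is treated as lying between genes.
    return "intergenic"


# Example: a coding gene on the plus strand with a single intron.
gene = Transcript("chr1", 1000, 5000, "+", introns=((2000, 3000),))
print(classify(Transcript("chr1", 2200, 2800, "+"), gene))
print(classify(Transcript("chr1", 200, 900, "-"), gene))
```

In practice an lncRNA would be compared against every annotated gene, and annotation projects may draw the category boundaries differently, so the rules here should be read as one possible interpretation of the scheme rather than a standard.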

Fig. 1 Paradigms for functions of lncRNAs

lncRNAs and functions

lncRNAs and transcriptional regulation

Diverse mechanisms of transcriptional regulation by lncRNAs have been described. In human cell lines, the cyclin D1 (CCND1) promoter region produces low-copy-number lncRNAs that are induced in response to DNA damage signals and associate with the 5′ regulatory region of CCND1. These lncRNAs recruit the translocated-in-liposarcoma (TLS) protein to the CCND1 promoter, where it inhibits the histone acetyltransferase activities of CREB-binding protein and p300 and thereby represses expression in a gene-specific manner [59]. lncRNAs can also effect global changes by interacting with basal components of the RNA polymerase II (Pol II)-dependent transcription machinery. lncRNAs that interact with the Pol II machinery are typically transcribed by Pol III, which decouples their expression from the Pol II-dependent transcription reaction they regulate [20, 37]. Human Alu RNA, transcribed from short interspersed elements, is a trans-acting transcriptional repressor during the cellular heat shock response. Alu RNA blocks transcription by binding Pol II and entering complexes at promoters [35]. Antisense (AS) transcripts are often classified as lncRNAs [39]. In zebrafish embryos, the antisense lncRNA tie-1AS selectively binds the mRNA of tyrosine kinase containing immunoglobulin and epidermal growth factor homology domain-1 (tie-1) and regulates tie-1 transcript levels, resulting in specific defects in endothelial cells. These results implicate lncRNA-mediated regulation of gene expression as a fundamental control mechanism for physiological processes such as vascular development [32]. lncRNAs can also regulate transcription indirectly by controlling the subcellular localization of transcription factors. The lncRNA NRON, a repressor of the nuclear factor of activated T cells (NFAT), affects the localization of NFAT, perhaps through interactions with nuclear transport factors. Knockdown of NRON results in increased nuclear NFAT and increased NFAT activity [61]. In mice, a 2.4-kb unspliced, polyadenylated, nuclear-retained ncRNA from chromosome 8 known as MRHL is processed by Drosha to yield an 80-nt small RNA, which can be processed further by Dicer to a 22-nt small RNA in an in vitro reaction [15].

Enhancers are classically defined as distal regulatory genomic elements. Recent research suggests that several enhancers function through lncRNAs [8]. The Evf-2 lncRNA, an alternatively spliced form of Evf-1, specifically cooperates with Dlx-2 to increase the transcriptional activity of the Dlx-5/6 enhancer, suggesting that Evf-2 has trans-acting enhancer activity [13]. lncRNAs with enhancer-like functions have been demonstrated in several human cell lines; for example, lncRNA-a3 enhances the expression of TAL1 [41]. Yang et al. found that lncRNA-HEIH is an oncogenic lncRNA that associates with enhancer of zeste homolog 2 (EZH2) to promote hepatocellular carcinoma progression [64]. Human maternally expressed gene 3 (MEG3) encodes an mRNA-like lncRNA of ∼1.6 kb. MEG3 activates the expression of Tp53 and enhances its binding affinity to the promoter of its target gene, Gdf15, implying a role for MEG3 in regulating the expression and transcriptional activation of Tp53 [67].

lncRNAs can also form highly specific interactions that are well suited to regulating steps in the post-transcriptional processing of mRNAs, such as splicing. For example, the natriuretic peptide precursor type A (NPPA) gene has an antisense transcript (NPPA-AS) [2]. NPPA-AS affects the alternatively spliced isoforms of NPPA mRNA. It is speculated that NPPA-AS and NPPA form RNA duplexes that affect NPPA mRNA splicing, suggesting that antisense transcription might be an important post-transcriptional mechanism modulating NPPA expression [2]. AS lncRNAs can also mask key cis elements in mRNA by forming RNA duplexes, as in the case of the Zeb2 antisense RNA, which is complementary to the 5′ splice site of an intron in the 5′ UTR of the zinc finger Hox mRNA Zeb2 [5].

lncRNAs and epigenetics

The study of lncRNA epigenetics in mammalian cells originated from work on X chromosome inactivation (XCI) and genomic imprinting [9, 63]. lncRNAs mediate epigenetic changes mainly by recruiting chromatin remodeling complexes to specific genomic loci. One of the first lncRNA genes reported was the imprinted H19 gene, which was quickly followed by the discovery of the silencing X-inactive-specific transcript (Xist) [14]. The H19 gene encodes a 2.3-kb lncRNA that is expressed exclusively from the maternal allele. H19 and its reciprocally imprinted protein-coding neighbor, the insulin-like growth factor 2 (IGF2) gene at 11p15.5, were among the first genes found to demonstrate genomic imprinting [14]. XCI, both imprinted and random, is mediated by the iconic lncRNA Xist [4, 6, 7]. During female development, Xist is expressed from the inactive X chromosome and coats the X from which it is transcribed, leading to chromosome-wide repression of gene expression. An overlapping antisense lncRNA called Tsix represses Xist expression in cis, while the lncRNA Jpx, which accumulates during XCI, activates Xist on the inactive X. Hox transcript antisense RNA (HOTAIR) originates from the HOXC locus and silences transcription across 40 kb of the HOXD locus in trans by inducing a repressive chromatin state, which is proposed to occur through HOTAIR-mediated recruitment of the polycomb chromatin remodeling complex PRC2 [51]. In vertebrates, Hox genes, which encode homeodomain transcription factors critical for positional identity, are clustered in four chromosomal loci and are expressed in nested anterior–posterior and proximal–distal patterns colinear with their genomic position from 3′ to 5′ of the cluster. HOTTIP is an lncRNA transcribed from the 5′ tip of the HOXA locus that coordinates the activation of several 5′ HOXA genes. Chromosomal looping brings HOTTIP into close proximity to its target genes. Thus, HOTTIP may organize chromatin domains to coordinate long-range gene activation [60]. A recent study suggests that the AS lncRNA Kcnq1ot1 associates with chromatin, forming a Kcnq1ot1-silenced nuclear subdomain [42]. Silencing appears to be caused by Kcnq1ot1-mediated recruitment of repressive chromatin remodeling complexes and DNA methyltransferases [38]. Epigenetic silencing of tumor suppressor gene promoters is also one of the most common observations in cancer. For example, in human cells, the antisense RNA p15AS, which is associated with the tumor suppressor gene p15, alters histone methylation to silence the expression of this gene [65].

lncRNAs, disease, and cancer

Misexpression of lncRNAs has been shown to contribute to numerous diseases. For example, an lncRNA may influence the pathogenesis of Alzheimer's disease: BACE1-AS regulates BACE1 mRNA expression, which places BACE1 under the control of a regulatory non-coding RNA that may drive Alzheimer's disease-associated pathophysiology [12]. BC200 RNA is a 200-nt non-coding RNA that is expressed in the nervous system [57]. Mice lacking the BC1 ncRNA, which is localized to neurites, showed altered behavior and reduced survival in an outdoor pen [31]. Recent studies have shown that lncRNAs may influence the pathogenesis of fragile X syndrome and fragile X-associated tremor and ataxia syndrome, which are caused, respectively, by mutation and pre-mutation in the protein-coding FMR1 gene. FMR4 is a primate-specific lncRNA that appears to share a bidirectional promoter with the FMR1 gene [29].

Aberrant expression of lncRNAs has been associated with cancers, suggesting a critical role in oncogenesis [24]. The well-studied lincRNA HOTAIR is highly expressed in breast cancer metastases [15], and its overexpression in various breast carcinoma cell lines promotes invasion [47]. Enforced expression of HOTAIR results in an altered pattern of H3K27 methylation and increased cancer invasiveness [47]. Prostate cancer non-coding RNA 1 (PCNCR1) was identified in a “gene desert” on chromosome 8q24.2 and is associated with susceptibility to prostate cancer. PCNCR1 is expressed as an intronless, ∼13-kb transcript with a potential role in trans-activation of the androgen receptor (AR), which is involved in prostate carcinogenesis [10, 48]. Experiments with whole-genome tiling arrays, searching for long, conserved, abundantly expressed lncRNAs, revealed 15 transcripts whose expression was altered in breast and ovarian cancer; at least three of them are intronic [45]. In neuroblastoma cell lines, the posterior pituitary hormone oxytocin increased the levels of MALAT1 lncRNA, which peaked 6–24 h after stimulation. High expression of MALAT1 lncRNA may serve as a tumor marker [30]. Various lncRNAs have been found to be overexpressed in tumors such as Ewing's sarcoma and lung and breast carcinomas [26, 54].

Nevertheless, some lncRNAs exhibit tumor suppressive activities, such as lincRNAp21, a transcriptional target of the p53 tumor suppressor. lincRNAp21 is required for the global repression of genes that interfere with p53 function in regulating cellular apoptosis. lincRNAp21 can mediate gene repression by physically interacting with the protein hnRNP-K, allowing its localization to the promoters of genes to be repressed in a p53-dependent manner, and its overexpression in a lung cancer cell line sensitizes the cells to DNA damage-induced apoptosis [23]. An lncRNA termed SPRY4-IT1 is derived from an intron of the SPRY4 gene and is predominantly localized in the cytoplasm of melanoma cells. Gain of function of this lncRNA attenuated cellular growth while increasing cell death, suggesting a tumor suppressive role [28]. Growth arrest-specific transcript 5 (GAS5), another lncRNA, is part of the transcription complex. GAS5 transcript levels were significantly reduced in breast cancer samples relative to adjacent unaffected normal breast epithelial tissues. Research also indicates that GAS5 is critical to the control of mammalian apoptosis and cell population growth and enhances cell susceptibility to apoptosis inducers. GAS5 can also control apoptosis and cell outgrowth in the absence of other stimuli [40].

Viral lncRNAs and host defenses

Viral lncRNAs have been found in diverse families of DNA viruses. Epstein–Barr virus (EBV) is a lymphotropic herpesvirus that encodes two abundant lncRNAs called EBV-encoded RNAs (EBERs) [53]. The EBERs loosely resemble the adenovirus virus-associated (VA) RNAs in size and structure. However, unlike the VA RNAs, EBERs are strictly nuclear, which rules out an identical function in inhibiting cytoplasmic protein kinase R. EBERs can prevent interferon-mediated apoptosis in some cell types. Thus, VA RNAs and EBERs are two viral lncRNAs that have established the precedent of viral ncRNAs inactivating host defenses (Fig. 2) [56]. Recently, Reeves et al. used a protein affinity approach to determine the function of another viral lncRNA. Human cytomegalovirus (HCMV) encodes β2.7, a non-coding RNA that accounts for a large fraction (∼20%) of the transcripts expressed during lytic infection [50].

Fig. 2 Functions of viral lncRNAs in evading host defenses

lncRNAs and stem cell development

Cellular reprogramming demonstrates the remarkable plasticity of cell fates, as illustrated by the derivation of induced pluripotent stem cells (iPSCs) from fibroblasts. Until recently, it was not known whether the large-scale transcriptional changes induced by reprogramming extend to lncRNAs and whether such changes have any functional relevance. Loewer et al. found that the derivation of human iPSCs is accompanied by expression changes in numerous lncRNAs, including lncRNAs whose expression is linked to pluripotency [34]. These lncRNAs are direct targets of key pluripotency transcription factors. Using loss-of-function and gain-of-function approaches, the same study identified one such lncRNA (lncRNA-RoR) that modulates reprogramming, providing a first demonstration of critical functions for lncRNAs in the derivation of pluripotent stem cells [34]. lncRNAs can also help regulate development by physically interacting with proteins to coordinate gene expression in embryonic stem cells (ESCs), suggesting that lncRNAs may play key roles in the circuitry controlling the ESC state [19]. Guttman et al. found that lncRNAs play a central role in ESC development, contrary to the view that proteins alone are the master regulators of this process. Their work indicates that lncRNAs help determine the fate of ESCs, either by maintaining them in the pluripotent state or by directing them along the path to cell specialization [19].

lncRNAs and immunity

With the advent of next-generation sequencing technologies, whole-transcriptome analysis of the host response, including long ncRNAs, is now possible. Evidence also shows that lncRNAs are associated with diverse biological processes across different tissues and are involved in the host response to viral infection and in innate immunity [18]. For example, using a custom 70-mer microarray, Pang et al. showed that lncRNA probes had altered expression during CD8+ T cell differentiation upon antigen recognition [43]. Ahanda et al. identified eight mRNA-like lncRNAs, some of which were differentially expressed in virus-infected birds [1]. Likewise, Peng et al. performed whole-transcriptome analysis of severe acute respiratory syndrome coronavirus-infected lung samples by deep sequencing. Their results show widespread differential regulation of lncRNAs in response to viral infection, suggesting that these lncRNAs are involved in regulating the host response, including innate immunity [44].

Perspectives

Studies of gene expression in eukaryotes have begun to uncover the world of lncRNAs. Compared with research on protein-coding sequences and the various small RNAs, lncRNA research is still in its infancy, and the functional roles of the vast majority of lncRNA genes remain in question. Early studies suggested that lncRNAs act mainly as regulators of protein synthesis, but it is quickly becoming clear that lncRNAs can have numerous molecular functions. lncRNAs may serve as molecular signals and can act as markers of functionally significant biological events [52]. They are becoming recognized as a hallmark feature of disease and cancer and may potentially be used to develop novel diagnostic and therapeutic strategies. Different lncRNAs engage diverse mechanisms that lead to different regulatory outcomes. The spatio-temporal expression profiles of some lncRNAs indicate that they may have important physiological and biochemical functions. More recently, a conserved lncRNA called yellow-achaete intergenic RNA (yar), found in the cytoplasm of Drosophila cells, was shown to be required for the regulation of sleep: loss of yar alters sleep regulation in the context of a normal circadian rhythm [55]. In sum, intense investigation of lncRNAs will likely expand our understanding of both their cell biology and their functions.