Background
Long non-coding RNAs (lncRNAs) are non-coding RNAs ranging in length from 200 nucleotides to ~100 kilobases [
1]. LncRNAs are implicated in a variety of biological processes, and deregulated lncRNAs may serve as biomarkers and therapeutic targets for cancer [
2]. Many studies identify cancer-related lncRNAs using differential expression analysis methods, such as the t-test, edgeR [
3] and DESeq [
4], which are designed to detect differentially expressed (DE) lncRNAs at the population level. Although some methods, such as Maximum Ordered Subset T-statistic (MOST) [
5], Cancer Outlier Profile Analysis (COPA) [
6], Outlier Sums (OS) [
7] and Outlier Robust T-statistic (ORT) [
8], have been proposed to detect differentially expressed genes (DEGs) in subgroups of cancer samples, none has been used to detect DE lncRNAs in individual patients, despite the high heterogeneity of lncRNA expression among patients. Recently, our research group developed new methods to detect patient-specific differential expression [
9,
10]. We have shown that the relative expression rankings of genes (miRNAs) tend to be highly stable in specific normal human tissues but widely disturbed in the corresponding cancer tissues, and that reversals in the rank relationship between the expression levels of genes (miRNAs) can be used to identify DE genes (miRNAs) in individual patients. The advantage of this relative ordering-based approach is that it is insensitive to batch effects and data normalization and thus can directly utilize data from different datasets [
9‐
11]. Thus, by evaluating the lncRNA expression profiles in this study, we proposed a new method (
LncRIndiv) to detect DE lncRNAs in individual patients, which improves on our original methodology that was developed to detect DE miRNAs in individuals [
9].
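The rank-reversal idea described above can be illustrated with a minimal sketch. It is not the LncRIndiv implementation: the function names, the `min_frac` stability threshold and the dictionary-based data layout are all assumptions made for illustration, and the sketch only counts reversed pairs without applying any significance test.

```python
from typing import Dict, List

Profile = Dict[str, float]  # lncRNA name -> expression value in one sample


def stable_partners(normals: List[Profile], target: str,
                    min_frac: float = 0.99) -> Dict[str, int]:
    """Find partners whose order relative to `target` is stable in normal samples.

    Returns +1 where target > partner in at least `min_frac` of the normals,
    and -1 where target < partner in at least `min_frac` of the normals.
    `min_frac` is a hypothetical threshold, not a value taken from the paper.
    """
    partners: Dict[str, int] = {}
    n = len(normals)
    for gene in normals[0]:
        if gene == target:
            continue
        higher = sum(1 for s in normals if s[target] > s[gene])
        lower = sum(1 for s in normals if s[target] < s[gene])
        if higher / n >= min_frac:
            partners[gene] = 1
        elif lower / n >= min_frac:
            partners[gene] = -1
    return partners


def reversal_summary(sample: Profile, target: str,
                     partners: Dict[str, int]) -> Dict[str, int]:
    """Count stable pairs whose order is reversed in one cancer sample."""
    # Target was below the partner in normals but is above it now.
    up = sum(1 for g, d in partners.items()
             if d == -1 and sample[target] > sample[g])
    # Target was above the partner in normals but is below it now.
    down = sum(1 for g, d in partners.items()
               if d == 1 and sample[target] < sample[g])
    return {"up_reversals": up, "down_reversals": down,
            "stable_pairs": len(partners)}
```

Because only within-sample comparisons are used, the counts do not change under any monotone rescaling of a sample's expression values, which is the property that makes ordering-based approaches insensitive to normalization.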
Considerable effort has been devoted to identifying lncRNA prognostic signatures for cancers using absolute expression profiles and risk-score-based methods [
2,
12,
13]. However, owing to experimental batch effects and platform differences, score-based signatures tend to produce spurious risk classifications in independent samples measured by different laboratories and are therefore difficult to apply clinically [
11]. Fortunately, we found that prognostic signatures derived from the within-sample relative expression rankings of genes (miRNAs), rather than from absolute expression values, are robust in independent datasets from different laboratories and platforms [
9,
10]. For example, our previous work found that a change in the expression rank of hsa-miR-29c relative to hsa-miR-30b can be used as a biomarker of poor overall survival in breast cancer patients [
9]. Thus, the individual-level differential expression of lncRNAs derived by the
LncRIndiv method could be applied to detect prognostic signatures for cancer.
Lung adenocarcinoma (LUAD) is one of the important sub-types of lung cancer with high morbidity and mortality [
14]. In this study, using LUAD as a case study, we demonstrated that
LncRIndiv performs well in the individual-level analysis of deregulated lncRNAs in independent paired normal-cancer samples. In addition, a significant proportion of up- or down-regulated DE lncRNAs were concordant with copy number amplifications or deletions, respectively, providing evidence of the high reliability of the
LncRIndiv method. Based on the individual-level differential expression analysis of lncRNAs, we developed a new prognostic signature (
C1orf132 and
TMPO-AS1) for stage I and II LUAD patients without adjuvant therapy. This new signature does not rely on pre-set thresholds for prognostic prediction and performed well in independent datasets.
Discussion
Aberrant expression of lncRNAs in cancer patients has been widely reported [
27]. The expression levels of lncRNAs are also highly heterogeneous across patients with the same cancer type. Current methods for detecting DE lncRNAs operate at the population level rather than the individual level. Based on our previous study on detecting DE miRNAs in individuals [
9], we developed a new method,
LncRIndiv, to detect DE lncRNAs in individual cancer patients; it is not limited by platform, data normalization method or batch effects. In
LncRIndiv, we used the coefficient of variation (CV) of the rank of partner lncRNAs, rather than their absolute expression levels as in our previous work [
9], which avoids batch effects between datasets. Notably, absolute expression values, rather than ranks, directly reflect the direction of differential expression of each lncRNA in paired cancer and normal samples. Thus, we used the expression rank of lncRNAs in paired samples to evaluate the performance of the
LncRIndiv method.
LncRIndiv performed well in the independent paired LUAD datasets and in the simulated data.
In our work, the
LncRIndiv method also identified some DE lncRNAs that were well characterized by other studies (Additional file
3: Table S5, Additional file
4: Table S6, Additional file
5: Figure S1B and S2). For example, Hou
et al. revealed that enhanced expression of long non-coding RNA
ZXF1, also known as
ACTA2-AS1 (Ensembl ID: ENSG00000180139.10) (Additional file
6: Table S11), promoted the invasion and migration of LUAD cells [
28].
LINC01207, also named
RP11-294O2.2 (Ensembl ID: ENSG00000248771.1) (Additional file
6: Table S11), was significantly up-regulated in advanced LUAD, and siRNA-mediated knockdown of
LINC01207 in the A549 cell line inhibited cell proliferation [
29]. The differential expression profiles of lncRNAs in individuals could be partly validated by the copy number alterations of those lncRNAs in the same individuals. Yan
et al. pointed out that copy number alteration is an important mechanism leading to the aberrant expression of lncRNAs in cancer [
27]. For example, the deregulation of lncRNA
BCAL8 showed a positive correlation with its copy number alteration and was significantly associated with poor survival in breast cancer [
27]. In our results, 51.3% of lncRNAs showed significantly consistent changes between copy number alteration and deregulation of expression in individual LUAD patients. Some DE lncRNAs with consistent copy number alterations in our results have been shown to be tumor-suppressor or oncogenic lncRNAs in cancer (Additional file
6: Table S9). For example, Yao
et al. found that the down-regulation of
ADAMTS9-AS2 resulted in a significant loss in the inhibition of glioma cell migration [
30]. These results not only suggest that the differential expression of lncRNAs in individuals may be due to their own copy number alterations, but also support the high reliability of the individual-level lncRNA differential expression profiles derived by the
LncRIndiv method. Notably, the remaining lncRNAs, which lacked significant consistency between differential expression and copy number alteration, may be affected by other mechanisms such as mutation or methylation, which warrants future work.
Some studies use the average or median score or the expression level as cut-offs to distinguish high- and low-risk patients [
13,
31‐
33]. However, these methods set the threshold for prognostic signature detection arbitrarily and are difficult to apply in clinical practice [
11,
34]. Our study reveals a robust 2-lncRNA signature for LUAD patients, which was validated in independent datasets and supported by GO enrichment analysis. In clinical application, for each individual LUAD patient, we only need to test whether the expression of
C1orf132 is lower than
IQCH-AS1,
RP11-589P10.5 and
LINC00938, or the expression of
TMPO-AS1 is higher than
PCBP1-AS1,
TCL6 and
RP11-333E1.1. Our pathway analysis suggests that the lncRNAs in the signature contribute to the poor prognosis of LUAD patients by deregulating the cell cycle and cell adhesion molecule pathways in cancer cells, which deserves detailed biological experiments in the future. Notably, our results also showed that stage is a factor related to the prognosis of LUAD patients. However, as shown in Table
2, the multivariate Cox analysis showed that the 2-lncRNA signature is independent of the clinical factor of stage.
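The within-sample comparison described above can be written as a short rule. This is a sketch under stated assumptions rather than the authors' implementation: it assumes that the target lncRNA must be lower (or higher) than all three of its partners, and that either condition on its own places a patient in the high-risk group. The function name `is_high_risk` and the dictionary input are hypothetical.

```python
from typing import Dict

# Partner lncRNAs named in the text for each lncRNA in the signature.
C1ORF132_PARTNERS = ("IQCH-AS1", "RP11-589P10.5", "LINC00938")
TMPO_AS1_PARTNERS = ("PCBP1-AS1", "TCL6", "RP11-333E1.1")


def is_high_risk(expr: Dict[str, float]) -> bool:
    """Apply the 2-lncRNA rank rule to one patient's expression profile.

    Assumption: "lower than" / "higher than" means relative to all three
    partners, and either condition alone marks the patient as high risk.
    """
    c1orf132_low = all(expr["C1orf132"] < expr[p] for p in C1ORF132_PARTNERS)
    tmpo_as1_high = all(expr["TMPO-AS1"] > expr[p] for p in TMPO_AS1_PARTNERS)
    return c1orf132_low or tmpo_as1_high
```

The rule needs no cohort-derived cut-off: only the ordering of eight expression values measured in the same sample enters the decision.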
In our study, we found the down-regulation of
C1orf132 was associated with poor prognosis. The underlying mechanism is still unclear. It has been proposed that lncRNAs can act as competing endogenous RNAs (ceRNAs) to influence miRNA activity and thereby regulate the target transcripts containing miRNA-binding sites [
35]. We hypothesized that
C1orf132 may act as a ceRNA for the tumor suppressors
RBL2 and
CCND3, whose expression showed a significant positive correlation with that of
C1orf132 (Fig.
3b and c). By integrating the lncRNA-miRNA interactions and miRNA-target interactions in the databases miRanda [
36], miRTarBase [
37], miRcode [
38] and TargetScan [
39], we found that
C1orf132 shared a significant number of binding miRNAs with
RBL2 (P = 2.38 × 10⁻¹², hypergeometric test) and
CCND3 (P = 9.82 × 10⁻⁵, hypergeometric test) (Additional file
6: Table S12). Some miRNAs, such as hsa-miR-93 [
40], hsa-miR-372 [
41], hsa-miR-424 [
42], have been reported to play important roles in the progression of LUAD. Thus, we inferred that the down-regulation of
C1orf132 might release the miRNAs that target
RBL2 and
CCND3 and thereby promote tumor progression, which warrants further in-depth experimental research.
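A hypergeometric test of this kind asks whether a lncRNA and a gene share more miRNAs than expected by chance. The sketch below shows the upper-tail calculation using only the Python standard library. The parameter names and the choice of miRNA universe are assumptions for illustration; the counts behind the reported P values are in Additional file 6: Table S12 and are not reproduced here.

```python
from math import comb


def hypergeom_upper_tail(total_mirnas: int, lnc_mirnas: int,
                         gene_mirnas: int, shared: int) -> float:
    """P(X >= shared) for a hypergeometric overlap.

    total_mirnas: size of the miRNA universe (assumed background set)
    lnc_mirnas:   miRNAs predicted to bind the lncRNA
    gene_mirnas:  miRNAs predicted to target the gene
    shared:       miRNAs common to both sets
    """
    upper = min(lnc_mirnas, gene_mirnas)
    denom = comb(total_mirnas, gene_mirnas)
    # math.comb returns 0 when k > n, so impossible overlaps add nothing.
    numer = sum(
        comb(lnc_mirnas, i) * comb(total_mirnas - lnc_mirnas, gene_mirnas - i)
        for i in range(shared, upper + 1)
    )
    return numer / denom
```

For example, with a toy universe of 10 miRNAs, 4 binding the lncRNA and 3 targeting the gene, `hypergeom_upper_tail(10, 4, 3, 2)` gives the probability of sharing at least 2 miRNAs by chance, which is 40/120.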
Nevertheless, our present method also has some limitations. First, although the consistency scores are relatively high, the
LncRIndiv method may have insufficient power to detect all samples in which a given lncRNA is differentially expressed. When we applied the
LncRIndiv method to simulated data containing a large number of samples with pre-set DE lncRNAs, sensitivity decreased as the number of DE samples increased (Additional file
6: Table S13). However, for each sample, although a certain number of DE lncRNAs may be missed, a significantly high proportion of lncRNAs show expression changes consistent with their copy number alterations, which indicates that the DE lncRNAs captured by our method in individual patients are likely to be true positives. Improving the power of
LncRIndiv warrants detailed future work. Second, we used paired cancer and normal samples to evaluate the performance of the
LncRIndiv method, an approach that lacks strict statistical justification. We therefore further assessed the extent of differential expression of the lncRNAs identified by the
LncRIndiv method, based on the hypothesis that the greater the extent of differential expression, the smaller the contribution of random error. The fold changes of lncRNAs in patients with DE lncRNAs were significantly higher than those in patients without the DE lncRNAs (P < 2.0 × 10⁻¹⁶, t-test). As the examples in Additional file
5: Figure S7 show, patients with the DE lncRNA differed more from their paired normal samples in expression values than patients without the DE lncRNA. Third, our work analyzed only the lncRNAs shared between the microarray and sequencing datasets. Because the number of lncRNAs re-annotated from the microarray is limited, the numbers of DE lncRNAs detected in individual patients differed between the microarray and sequencing datasets. Although some lncRNAs were lost in the microarray, the results derived by the
LncRIndiv method revealed a new robust prognosis-related lncRNA signature for stage I or II LUAD patients without adjuvant therapy, which was validated in other independent microarray datasets. The
LncRIndiv method could also be applied to other cancer types for which abundant lncRNA sequencing expression profiles are available. Finally, through KEGG pathway enrichment and correlation analysis between lncRNAs and DEGs, we found that the lncRNAs (
TMPO-AS1 and
C1orf132) could affect the prognosis of LUAD by deregulating cell cycle pathway genes. Although these results are interesting and meaningful, they lack validation by biological experiments. In future work, we will continue to investigate the biological mechanisms by which these lncRNAs regulate cell cycle genes during carcinogenesis.
Acknowledgements
The authors acknowledge the efforts of all the researchers who have contributed data to the public databases GEO, TCGA, Lnc2Cancer and TANRIC. The interpretation and reporting of these data are the sole responsibility of the authors.