Background
In recent years, advances in the depth and quality of transcriptome sequencing have allowed for the rapid discovery of long non-coding RNAs (lncRNAs), and accumulating evidence shows that lncRNAs are functional transcripts rather than biological noise. LncRNAs regulate diverse cellular processes, including chromatin modification, transcription initiation, and co- and post-transcriptional regulation [1]. A large number of studies have begun to elucidate the roles of lncRNAs in different biological systems, such as the reproductive, metabolic, and immune systems [2]. The prognostic power of lncRNA signatures has been investigated in various cancers, including glioblastoma multiforme [3], colorectal cancer [4], non-small cell lung cancer [5] and esophageal cancer [6].
Esophageal cancer is one of the most common malignant tumors worldwide [7]; nearly 450,000 new cases are diagnosed annually, and around 70% of the cases occur in China. Among the various histological subtypes of esophageal cancer, esophageal squamous cell carcinoma (ESCC) is the principal form in the vast majority of cases [8]. Its overall 5-year survival is less than 20%, due to the difficulty of early detection and frequent metastatic recurrence [9, 10]. For ESCC, the clinical staging system (pTNM) is the main prognostic indicator, but it has limited capacity in clinical practice. Increasing evidence has shown that mRNA or miRNA signatures are strong predictors of survival in patients with ESCC [11-13], and whether lncRNA signatures have similar prognostic power in esophageal cancer has drawn recent interest. Tong et al. selected ten lncRNAs based on a previous study and identified lncRNA POU3F3 in plasma as a novel biomarker for diagnosis of ESCC [14]. Li et al., using a lncRNA expression profile microarray, revealed a three-lncRNA signature associated with the survival of ESCC patients [6], and our research group combined protein-coding genes with lncRNAs into a novel multi-dimensional clinical signature to predict prognosis for patients with ESCC [15]. Together, these findings suggest that lncRNAs are promising prognostic biomarkers for clinical use in ESCC.
In a previous study, we reported lncRNA expression profiles in 15 paired ESCC tissues and adjacent non-tumor tissues via transcriptome RNA sequencing, and further developed a method, denoted URW-LPE (Unsupervised Random Walk with each dysregulated LncRNA/PCG), to identify novel potential functional lncRNAs. Each dysregulated lncRNA/PCG (protein-coding gene) served as a seed, and extended co-expression relations served as edges, for the random walk. Differentially expressed lncRNAs and PCGs in ESCC were used to construct an extended lncRNA-PCG co-expression network; the random walk was run on this network, with the fold change (FC) values of the nodes taken as the initial probability vector. Thus, each lncRNA in the network was given a URWScore value, and lncRNAs with higher URWScore values were expected to possess more important biological functions in ESCC [16]. However, whether these candidate lncRNAs could be used as prognostic biomarkers for ESCC remained to be determined. In the present study, from the previously identified lncRNA biomarkers, we selected 10 higher-ranking lncRNAs and measured their expression using quantitative RT-PCR (qRT-PCR) in 138 patients with ESCC. Finally, we built another three-lncRNA signature (RP11-366H4.1.1, LINC00460 and AC093850.2) that was highly associated with the overall survival and disease-free survival of patients with ESCC, and further validated its prognostic value in an independent cohort of 119 patients (GSE53624) from the Gene Expression Omnibus (GEO) database [6].
Methods
Patients and tissue specimens
We collected paired tumor and adjacent non-tumor tissues from 138 patients with ESCC (between 2007 and 2009) at the Department of Oncological Surgery of the Central Hospital of Shantou City, P.R. China. Cases were included in this study only if follow-up was obtained and clinical data were available. Follow-up after esophageal resection continued until the patient’s death, and only patients who died from ESCC were counted as tumor-related deaths. Patients who suffered from severe post-operative complications, had other tumors or died of other causes were excluded. This study was approved by the Ethical Committee of the Central Hospital of Shantou City and the Medical College of Shantou University, and written informed consent was obtained from all surgical patients to use resected samples and clinical data for research. The tissue specimens were snap frozen in liquid nitrogen shortly after resection and stored at −80 °C until RNA extraction.
Cell lines and culture conditions
Human esophageal cancer cell lines KYSE150, KYSE180, KYSE450, KYSE70, KYSE140 and TE3 were kindly provided by Dr. Ming-Zhou Guo (Chinese PLA General Hospital, Beijing, China) and grown in RPMI 1640 medium (Invitrogen, California, USA), with both media supplemented with 10% FBS (Invitrogen, California, USA). The human immortalized esophageal cell line NE2 was kindly provided by Professor Sai-Wah Tsao (University of Hong Kong, China) and grown in defined keratinocyte serum-free medium (Gibco, Grand Island, NY, USA) and Cascade Biologics® EpiLife® (Life Technologies, Grand Island, NY, USA) in a 1:1 mixture. All cell lines were cultured at 37 °C in 5% CO2 and 95% air.
Human samples or cell lines were lysed using TRIzol® (15596-018, Life Technologies, Carlsbad, CA, USA), and total RNA was released and further purified with a PureLink™ RNA Mini Kit (12183018A, Life Technologies, Carlsbad, CA, USA) according to the manufacturer’s protocol. The purity and concentration of RNA were determined from the OD260/280 ratio using a spectrophotometer (NanoDrop ND-2000).
Quantitative RT-PCR (qRT-PCR)
For qRT-PCR, the reverse transcription (RT) reactions were carried out with a PrimeScript™ RT reagent kit with gDNA Eraser (RR047A, TaKaRa, Dalian, China) according to the manufacturer’s protocol. Each reverse transcription reaction contained 1 μg total RNA. The 20 μl RT reaction mixture was incubated in a 2720 Thermal Cycler (Applied Biosystems). Quantitative PCR reactions were then performed on an ABI 7500 with SYBR® Premix Ex Taq™ (RR420A, TaKaRa, Dalian, China) in a 20 μl reaction volume, which also contained 2 μl cDNA and 0.8 μl PCR primer mix (forward and reverse primers at a final concentration of 0.2 μM each). The reactions were incubated at 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 34 s. The Ct value of each candidate lncRNA was then normalized to that of β-actin. Relative expression levels of the lncRNAs were calculated using the $2^{-\Delta Ct}$ method. Specimens that showed no amplification within 40 cycles were excluded. Sequences of primers for qRT-PCR of the lncRNAs are listed in Additional file 1: Table S1.
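The normalization described above can be illustrated with a minimal sketch. It assumes that ΔCt is the Ct of the lncRNA minus the Ct of β-actin, and that a specimen without amplification is recorded either as a missing value or as a Ct at or above 40; the function name, the encoding of missing values and the Ct values are hypothetical and are not taken from the study data.

```python
def relative_expression(ct_target, ct_reference, max_cycles=40.0):
    """Return 2^-dCt, or None when either reaction failed to amplify.

    dCt = Ct(target lncRNA) - Ct(reference gene, here beta-actin).
    Treating None or Ct >= max_cycles as "no amplification" is an assumption.
    """
    if ct_target is None or ct_reference is None:
        return None
    if ct_target >= max_cycles or ct_reference >= max_cycles:
        return None
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for one specimen (not real data).
print(relative_expression(28.0, 18.0))   # dCt = 10
print(relative_expression(None, 18.0))   # no amplification, specimen excluded
```

Returning `None` rather than zero keeps excluded specimens out of later averages and paired comparisons.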
SiRNA transfection
Cells were transfected with siRNAs against RP11-366H4.1.1, LINC00460 and AC093850.2, with scrambled siRNA used as a negative control. siRNA transfection was performed according to the X-tremeGENE siRNA transfection reagent instructions (Sigma-Aldrich, St. Louis, MO). The sequences for RP11-366H4.1.1 were sense: 5′-ACACACAUCCUAGUUCUUUdtdt-3′, and antisense: 5′-AAAGAACUAGGAUGUGUGUdtdt-3′. The sequences for LINC00460 were sense: 5′-GUCACCCCGAUUUAUGUUAdtdt-3′, and antisense: 5′-UAACAUAAAUCGGGGUGACdtdt-3′. The sequences for AC093850.2 were sense: 5′-GGACAAUGAAGACUGAACUdtdt-3′, and antisense: 5′-AGUUCAGUCUUCAUUGUCCdtdt-3′. The negative control siRNA was sense: 5′-UUCUCCGAACGUGUCACGdtdt-3′, and antisense: 5′-CGUGACACGUUCGGAGAAdtdt-3′.
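As a quick consistency check on the duplexes listed above, the following sketch tests whether each antisense strand is the reverse complement of its sense strand. It assumes that the trailing "dtdt" overhangs are not part of the pairing region and that any spaces inside a printed sequence are typesetting artifacts; the helper name and the dictionary layout are illustrative only.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


# (sense, antisense) core sequences as listed in the text, overhangs removed.
SIRNAS = {
    "RP11-366H4.1.1": ("ACACACAUCCUAGUUCUUU", "AAAGAACUAGGAUGUGUGU"),
    "LINC00460": ("GUCACCCCGAUUUAUGUUA", "UAACAUAAAUCGGGGUGAC"),
    "AC093850.2": ("GGACAAUGAAGACUGAACU", "AGUUCAGUCUUCAUUGUCC"),
    "negative control": ("UUCUCCGAACGUGUCACG", "CGUGACACGUUCGGAGAA"),
}

for name, (sense, antisense) in SIRNAS.items():
    # True means the two strands form a fully complementary duplex.
    print(name, reverse_complement(sense) == antisense)
```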
After RP11-366H4.1.1, LINC00460 or AC093850.2 was individually knocked down, KYSE150 or KYSE70 cell migration and colony formation assays were performed as previously described [16]. Briefly, at 24 h post-transfection, cells were starved for 12 h in serum-free medium (Invitrogen, California, USA), and then $5 \times 10^{4}$ cells were plated in serum-free medium in the upper well of a transwell chamber (24-well insert; pore size, 8 μm; BD Biosciences, Franklin Lakes, NJ, USA), with the lower chamber containing medium with 10% FBS. After 48 h, cells in the top chamber were removed with a cotton swab, and only cells that had migrated through the pores were fixed, stained with haematoxylin solution (Sigma-Aldrich, St. Louis, MO, USA) and counted. For colony formation, 500 cells per well in a 24-well plate were incubated in medium supplemented with 10% FBS for ten days, and then colonies were stained with haematoxylin solution and observed.
The URW-LPE method has been previously described in detail [16]. Briefly, each dysregulated lncRNA/PCG (protein-coding gene) served as a seed, and extended co-expression relations served as edges, for the random walk. Differentially expressed lncRNAs and PCGs in ESCC were used to construct an extended lncRNA-PCG co-expression network; the random walk was run on this network, with the fold change (FC) values of the nodes regarded as the initial probability vector. The random walk was represented by the formula $p_{t+1} = (1 - r) W p_t + r p_0$, where $W$ is the adjacency matrix of the lncRNA-PCG co-expression network, $p_t$ is a vector representing the probabilities of the corresponding lncRNA/PCG nodes at step $t$, and $p_0$ is the initial probability vector. Thus, each lncRNA in the network was given a URWScore value, and lncRNAs with a higher URWScore value may possess more important biological functions in ESCC. In the lncRNA-PCG co-expression network, protein-coding genes highly associated with the higher-URWScoring lncRNAs (Pearson correlation coefficient > 0.40, P < 0.05) were selected. The association of the lncRNAs with potential protein-coding genes was visualized with Cytoscape v2.8.3 [17]. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analyses for the co-expressed protein-coding genes were performed using the online DAVID database (https://david.ncifcrf.gov/) [18].
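The update rule $p_{t+1} = (1 - r) W p_t + r p_0$ can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the published URW-LPE implementation: $r$ is read as a restart probability, $W$ is column-normalized before iteration, the fold-change values are positive and rescaled to sum to one, and iteration stops when successive vectors differ by less than a small tolerance. The value of `r`, the toy network and the fold changes are all hypothetical.

```python
def column_normalize(adj):
    """Scale each column of a square adjacency matrix to sum to 1 (assumed step)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adj, p0, r=0.5, tol=1e-10, max_iter=1000):
    """Iterate p_{t+1} = (1 - r) * W * p_t + r * p_0 until the L1 change < tol."""
    w = column_normalize(adj)
    total = sum(p0)
    p0 = [x / total for x in p0]      # rescale fold changes to a probability vector
    p = p0[:]
    n = len(p)
    for _ in range(max_iter):
        nxt = [(1 - r) * sum(w[i][j] * p[j] for j in range(n)) + r * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy co-expression network: node 1 is co-expressed with nodes 0 and 2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
fold_changes = [2.0, 1.0, 1.0]        # hypothetical FC values per node
scores = random_walk_with_restart(adjacency, fold_changes)
print(scores)                          # steady-state score per node
```

In this reading, the steady-state entry for each lncRNA node plays the role of its URWScore: nodes that are well connected to strongly dysregulated neighbours accumulate more probability.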
Statistical analysis
The 138 specimens were randomly separated into a training set (n = 77) and a test set (n = 61). A multivariable Cox regression model, including age, gender, histologic grade, invasive depth, lymph node metastasis and therapies, was constructed in the training set [19]. Comparisons of clinicopathological characteristics between the two sets were performed using the t-test, Fisher’s exact test and chi-squared test. Comparisons of relative expression between tumor and paired adjacent normal tissues were performed using a paired t-test. Overall survival (OS) was measured from the date of surgery to death or the latest follow-up. Disease-free survival (DFS) was measured from the date of surgery to the first occurrence of any of the following events: recurrence, distant metastasis or death from any cause without documentation of a cancer-related event [20]. The optimal cut-off points for lncRNA expression ($2^{-\Delta\Delta Ct}$, where $\Delta\Delta Ct = \Delta Ct_{\text{tumor}} - \Delta Ct_{\text{normal}}$ and $\Delta Ct = Ct(\text{selected lncRNA}) - Ct(\beta\text{-actin})$) and for the risk score were determined with the X-tile program [21]. According to the cut-off values, the relative lncRNA levels in the 138 paired ESCC samples and adjacent normal tissues were divided into high- or low-expression groups using X-tile, and the probabilities of OS and DFS for patients with ESCC were then calculated by Kaplan-Meier analysis and compared using the log-rank test. A two-tailed P-value less than 0.05 was considered statistically significant. All analyses were performed using SPSS 19.0 (IBM, Armonk, New York, USA) for Windows.
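The expression measure and the survival estimate described in this section can be sketched as follows. The analyses in the study were run in X-tile and SPSS; the code below only illustrates the arithmetic, with a hypothetical cut-off supplied by hand (it does not reproduce X-tile's cut-point search), a "greater than the cut-off means high" rule chosen as an assumption, and invented Ct values and follow-up times.

```python
def fold_change(ct_lnc_tumor, ct_actin_tumor, ct_lnc_normal, ct_actin_normal):
    """Return 2^-ddCt with ddCt = dCt(tumor) - dCt(normal), dCt = Ct(lncRNA) - Ct(beta-actin)."""
    d_tumor = ct_lnc_tumor - ct_actin_tumor
    d_normal = ct_lnc_normal - ct_actin_normal
    return 2.0 ** (-(d_tumor - d_normal))


def split_by_cutoff(values, cutoff):
    """Label each value 'high' if it exceeds the cut-off, otherwise 'low'."""
    return ["high" if v > cutoff else "low" for v in values]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    events: 1 = event observed (e.g. death), 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        while i < len(data) and data[i][0] == t:   # group tied times
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical values for illustration only.
fc = fold_change(25.0, 18.0, 27.0, 18.0)           # ddCt = -2
print(fc, split_by_cutoff([fc, 0.5], cutoff=1.0))
print(kaplan_meier([5, 10, 15, 20], [1, 0, 1, 1]))  # months, event flags
```

Computing one curve per expression group gives the two Kaplan-Meier curves that the log-rank test then compares.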
Discussion
The seventh edition of the AJCC staging system (pTNM stage) is the only appropriate reference for predicting the prognosis of patients with ESCC. However, the staging system still needs to be modified in some aspects, and its predictive ability needs further improvement given the dismal 5-year survival rate [22]. Therefore, there is an immense clinical need for prognostic biomarkers of ESCC. Recent studies have demonstrated that combinations of several biomarkers have better predictive ability than individual biomarkers. Different types of prognostic signatures have been identified, including protein-coding gene signatures and non-coding gene signatures. Some protein-coding gene signatures are highly predictive of ESCC survival in both generation and validation datasets, such as the combination of EGFR, p-Sp1, and fascin; the GASC1-targeted genes PPARG, MDM2, and NANOG; and a panel of Annexin II, kindlin-2, and myosin-9. However, protein-coding gene signatures are inadequate to precisely predict the clinical outcome of ESCC [11, 23]. MicroRNAs (miRNAs) have their own advantages as biomarkers in formalin-fixed, paraffin-embedded (FFPE) tissues and bodily fluids, such as being small in size, containing a stem-loop structure, and being more stable than mRNAs [24]. A recent article reported a four-miRNA signature (composed of hsa-miR-218-5p, hsa-miR-142-3p, hsa-miR-150-5p, and hsa-miR-205-5p) that predicts ESCC patient survival [12].
Although miRNA and mRNA prognostic signatures robustly predict the survival of patients with ESCC, lncRNA signatures might help to predict the survival of patients more accurately than previously possible. In the present study, we found another three-lncRNA signature (AC093850.2, LINC00460 and RP11-366H4.1.1) in 138 paired ESCC tissues and adjacent normal tissues that robustly predicted the survival of patients. Furthermore, we analyzed the expression of the three lncRNAs in another 18 types of cancer using data from the TCGA database and found that only AC093850.2 and LINC00460 are associated with breast invasive carcinoma (BRCA) patient survival, whereas LINC00460 and RP11-366H4.1.1 are associated with head and neck squamous cell carcinoma (HNSC) patient survival (Additional file 1: Table S3 and Figure S1). This implies that the three-lncRNA signature might be specific to ESCC. When we applied the three-lncRNA signature to a test set of 61 patients with ESCC, we observed that patients with a low-risk three-lncRNA signature in their tumor specimens had longer overall survival than patients with a high-risk signature. The prognostic value of this three-lncRNA signature was further verified in an independent cohort of 119 patients with ESCC.
In the present study, we selected 10 lncRNAs with high URW-LPE scores obtained in a previous study to identify a three-lncRNA signature associated with overall survival and disease-free survival. However, only 3 lncRNAs could actually be confirmed. In the previous study, a lncRNA-PCG co-expression network was constructed using differentially expressed lncRNAs and known protein-coding genes in ESCC [16]. Therefore, based on URWScore, the selected lncRNAs may be associated with cancer cell proliferation, metastasis, differentiation, angiogenesis and the survival time of patients with esophageal cancer. In our previous and present studies, lncRNAs with a higher URWScore were determined by comparing the levels of lncRNAs in paired ESCC samples and adjacent normal tissues, and were then correlated with the survival times of patients. It is possible that the unconfirmed lncRNAs in this paper are involved in cancer cell differentiation or angiogenesis. In addition, our algorithm, like other reported algorithms, may have flaws.
Ten lncRNAs with higher URWScores were selected, and the association of these lncRNAs with the OS and DFS of patients with esophageal cancer in the training set was analyzed, resulting in identification of the three-lncRNA signature, composed of AC093850.2, LINC00460 and RP11-366H4.1.1, which was verified in the test set (n = 61) and in an independent cohort (n = 119) (Fig. 3). In a previous study, we reported that lncRNA625 (RP11-625H11.2.1) is associated with the OS and DFS of patients with stage III esophageal cancer and with lymph node metastasis. In the present study, without considering lymph node metastasis, we analyzed the association of lncRNA625 with the OS and DFS of patients in the training set (n = 77). The results showed no association with prognosis (Fig. 2), suggesting that lncRNA625 is connected with a particular stage of ESCC. By contrast, the expression of AC093850.2, LINC00460 and RP11-366H4.1.1 was associated with the OS and DFS of patients in the training set (n = 77) (Fig. 2). Therefore, it is inappropriate to combine lncRNA625 with AC093850.2, LINC00460 and RP11-366H4.1.1 into a prognostic biomarker signature for the OS and DFS of patients with esophageal cancer.
Our study differs from that published by Li et al. [6]. The main difference lies in the algorithms used. Based on a lncRNA expression profile microarray, Li et al. adopted a random forest supervised classification algorithm and a nearest shrunken centroid algorithm and revealed a three-lncRNA signature associated with the survival of esophageal cancer patients. In our previous study, a random walk algorithm was run on a lncRNA-PCG co-expression network constructed using differentially expressed lncRNAs and known protein-coding genes in ESCC. Thus, each lncRNA in the network was given a URWScore value, and lncRNAs with a higher URWScore were expected to possess more important biological functions in ESCC [16]. The selected lncRNAs with higher URWScores may be associated with cancer cell proliferation, metastasis, differentiation, angiogenesis and the survival time of patients with esophageal cancer. Another difference between our study and Li’s is that we report a novel lncRNA signature for predicting the OS and DFS of patients with esophageal cancer, derived by comparing the levels of lncRNAs in paired ESCC samples and adjacent normal tissues, suggesting that more lncRNAs may play critical roles in cancer cells.
In conclusion, the three-lncRNA signature is a significant predictor of OS and DFS. This finding might help doctors to individualize predictions of prognosis and recurrence. However, the predictive power of the signature needs further validation in additional studies involving larger populations.
Acknowledgements
We thank technician Guizhou She for performing siRNA transfection assays.