Main

Colorectal adenocarcinoma (CAC) is one of the leading causes of cancer-related death, with increasing incidence and mortality over the past several decades. Approximately 1.23 million people worldwide develop CAC annually, leading to 0.6 million deaths (Jemal et al, 2009; Ferlay et al, 2010). However, this high mortality could be significantly reduced if early-stage CAC and precancerous lesions were detected by more suitable diagnostic tools (Lieberman, 2009). Among the available CAC tests, colonoscopy is recommended as the most reliable tool for high-risk people. However, its wide application in large-scale screening is hampered by its invasive nature and high cost (Kim et al, 2007). In addition, the faecal occult blood test (FOBT) and carcinoembryonic antigen (CEA) detection have relatively poor sensitivity (Collins et al, 2005; Levin et al, 2008; Mandel, 2008). None of these methods has been established as a well-accepted diagnostic tool, particularly for early-stage CAC. Therefore, there is an urgent need to identify new biomarkers with high sensitivity and specificity for early-stage CAC and precancerous lesions.

MicroRNAs (miRNAs) are a class of small non-coding RNAs with 19–24 nucleotides, and they have attracted a great deal of attention in cancer research. Functional studies have indicated that deregulation of miRNAs is involved in the initiation and progression of human cancer (Garzon et al, 2006; Cho, 2007). Recently, serum miRNAs have been shown to be stable, and they could be excellent candidates for the early diagnosis of cancer (Zhang et al, 2010; Liu et al, 2011; Chen et al, 2012). Several studies have examined the alteration of serum miRNA expression in colorectal cancer patients compared with healthy controls (Chen et al, 2008; Ng et al, 2009; Huang et al, 2010; Pu et al, 2010). Additional studies have examined the expression of certain selected miRNAs in colorectal adenoma (CA), a usual precursor to CAC (Asaga et al, 2011; Oberg et al, 2011). However, those studies were limited by one or more of the following factors: limited numbers of screened miRNAs, small sample size, lack of independent validation and failure to differentiate CA from CAC and healthy controls.

In the present study, we characterised the genome-wide miRNA expression profile in the serum of patients with CAC and CA and of healthy controls using high-throughput Miseq sequencing followed by two phases of quantitative reverse transcription PCR (RT–qPCR) assays. Finally, we developed a 4-miRNA panel using a logistic regression model and identified serum biomarkers with high diagnostic accuracy for the early diagnosis of CAC.

Materials and methods

Study design

In the present investigation, the experimental process was divided into three phases (Figure 1). In the initial screening phase, serum samples were collected from 30 CAC patients, 25 CA patients and 30 healthy controls, and differentially expressed miRNAs were assessed by Miseq sequencing using the intersection of two pairwise comparisons: CAC vs CA and CA vs healthy controls. In the training phase, miRNAs identified via Miseq sequencing were first evaluated by RT–qPCR in an independent cohort of serum samples from 80 participants, including 40 CAC patients, 16 CA patients and 24 healthy controls. Subsequently, the validated miRNAs, which were differentially expressed between the CAC group and the control group (CA patients and healthy controls), were further examined in an additional 240 participants, including 120 CAC patients, 50 CA patients and 70 healthy controls. The diagnostic miRNA panel was then constructed from these 320 serum samples using a logistic regression model. In the validation phase, the parameters of the logistic model obtained from the training phase were applied to another independent cohort of 292 participants (117 CAC patients, 73 CA patients and 102 healthy controls) to validate the diagnostic performance of the miRNA panel. This study was approved by the Clinical Research Ethics Committee of Qilu Hospital of Shandong University.

Figure 1

Overview of the design strategy.

Patients and control subjects

Written informed consent for the use of a venous blood sample was obtained from every participant. All the CAC and CA patients were recruited from the Department of General Surgery and Gastroenterology, Qilu Hospital of Shandong University between 2010 and 2013. The healthy controls were enrolled from the Department of Physical Examination Center, Qilu Hospital of Shandong University.

All blood samples were collected before any therapeutic procedures, such as surgery, chemotherapy and radiotherapy, were performed. CAC and CA diagnosis was confirmed by histopathology or biopsy. Tumours were staged according to the tumour–node–metastasis (TNM) staging system of the Union for International Cancer Control (UICC). Age- and sex-matched healthy controls were recruited from a large pool of healthy individuals seeking a routine health check-up (Table 1).

Table 1 Clinicopathological data for patients and healthy controls

Serum preparation

Briefly, 5 ml of venous blood was collected from each subject. The whole blood was separated into serum and cellular fractions within 2 h by centrifugation at 4000 r.p.m for 10 min. The supernatant (serum) was collected and further centrifuged at 12 000 r.p.m for 15 min to completely remove the cell debris. The whole process was strictly controlled to avoid haemolysis. The obtained serum was stored at −80 °C before further analysis. The CEA level of each sample was determined using electrochemiluminescence immunoassay.

Miseq sequencing

For Miseq sequencing, equal volumes of serum from age- and sex-matched subjects, including 30 CAC patients, 25 CA patients and 30 healthy controls, were pooled. The sequencing procedure was conducted as described in our previous study (Zheng et al, 2013).

Quantification of miRNAs by RT–qPCR analysis

RT–qPCR was performed on an ABI PRISM 7500 Sequence Detection System (Applied Biosystems, Carlsbad, CA, USA) using the SYBR PrimeScript miRNA qPCR Kit (Takara Bio, Otsu, Japan). The reverse transcription was conducted in a 20-μl reaction system containing 10 μl of 2 × miRNA reaction buffer mix, 2 μl of miRNA PrimeScript RT Enzyme Mix (Takara Bio), 2 μl of 0.1% BSA and 3 μl of serum pre-mixed with 3 μl of serum buffer (2.5% Tween-20, 50 mmol l−1 Tris and 1 mmol l−1 EDTA). The reaction was performed at 37 °C for 60 min, followed by 85 °C for 5 s. The synthesised cDNA was centrifuged at 10 000 r.p.m for 10 min. The PCR was conducted in a 25-μl reaction system containing 12.5 μl of SYBR Premix Ex Taq (Takara Bio), 0.5 μl of dye, 2 μl of 5 μM forward primer, 1 μl of 10 μM Uni-miR qPCR Primer (Takara Bio), 7 μl of ddH2O and 2.0 μl of cDNA template. Briefly, after an initial denaturation step at 95 °C for 30 s, amplification was carried out for 45 cycles of denaturation at 95 °C for 5 s and annealing at 57 °C for 34 s. A dissociation curve was analysed for each PCR experiment to check for primer–dimer formation or contamination. All reactions were performed in triplicate, and the Cq values were determined using the default threshold setting. The combination of miR-191-5p and U6 snRNA was used as the reference for serum miRNA RT–qPCR detection, as described in our previous study (Zheng et al, 2013).
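
The normalisation formula is not spelled out in this section. As an illustration only, the following Python sketch assumes the commonly used 2^−ΔCq calculation, with the reference Cq taken as the mean of the miR-191-5p and U6 values. The function name and the Cq values are hypothetical and are not data from this study.

```python
from statistics import mean

def relative_expression(target_cqs, reference_cqs):
    """Relative level of a target miRNA by 2^-dCq (an assumed method).

    target_cqs    -- triplicate Cq values for the target miRNA
    reference_cqs -- dict mapping each reference name to its triplicate Cq values
    """
    # Average the technical replicates of the target.
    target_mean = mean(target_cqs)
    # Average each reference's replicates, then average across references.
    reference_mean = mean(mean(values) for values in reference_cqs.values())
    delta_cq = target_mean - reference_mean
    return 2.0 ** (-delta_cq)

# Illustrative Cq values only.
example_level = relative_expression(
    [28.0, 28.0, 28.0],
    {"miR-191-5p": [27.0, 27.0, 27.0], "U6": [25.0, 25.0, 25.0]},
)
```

With these illustrative inputs the target is two cycles later than the averaged reference, so the relative level is 2^−2.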

Statistical analysis

All statistical analyses were performed with SPSS 17.0. The non-parametric Mann–Whitney U-test was used to compare serum miRNA concentrations between the cancer group and the control group. A P-value of <0.05 was considered statistically significant. Multiple logistic regression analysis was used to establish the miRNA panel. Receiver operating characteristic (ROC) curves were constructed, and the area under the ROC curve (AUC) was used to evaluate the diagnostic performance of the selected miRNA panel. MedCalc software version 9.3.9.0 (MedCalc, Mariakerke, Belgium) was used to perform the ROC curve analysis.
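
The AUC and the Mann–Whitney U statistic are closely related: the AUC equals the proportion of case–control pairs in which the case has the higher score, with ties counted as one half. The Python sketch below illustrates this relationship. It is not the SPSS or MedCalc procedure used in the study, and the function name and scores are hypothetical.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic divided by n_cases * n_controls.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0   # case correctly ranked above control
            elif case == control:
                wins += 0.5   # ties contribute one half
    return wins / (len(case_scores) * len(control_scores))

# Illustrative scores only: 8 of the 9 pairs are ranked correctly.
example_auc = auc_from_scores([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```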

Results

Discovery of candidate biomarkers by Miseq sequencing

Of the 740 sequenced serum miRNAs, 348, 303 and 283 miRNAs were detectable (>10 copies) in CAC patients, CA patients and healthy controls, respectively. For the Miseq sequencing data, the final read count of each miRNA in each pooled group was normalised to the total reads of all called miRNAs. The miRNAs were selected as candidate biomarkers based on the following criteria: (a) having at least 50 copies in any group; (b) exhibiting at least 10-fold altered expression; and (c) belonging to the intersection of CAC vs CA and CA vs healthy controls. Levels of 12 miRNAs (miR-195-5p, miR-92a-3p, miR-1290, miR-582-5p, miR-223-3p, miR-136-5p, miR-3074-5p, miR-29a-3p, miR-221-3p, miR-148a-3p, miR-19a-3p and miR-17-3p) were significantly higher in CAC than in CA and healthy controls (fold change=10.78–399.02; P<0.05). In contrast, levels of 3 miRNAs (miR-422a, miR-1260a and miR-4502) were significantly lower in the CAC group (fold change=0.002–0.09; P<0.05). In summary, a total of 15 differentially expressed miRNAs were identified as candidate biomarkers to be further tested via RT–qPCR.
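
The three selection criteria can be sketched as a simple filter. The Python example below is one possible reading, not the authors' pipeline: it assumes that criterion (b) is applied to each pairwise comparison separately, in either direction, and that criterion (c) is the set intersection of the two resulting lists. The miRNA names and read counts are hypothetical.

```python
MIN_COPIES = 50    # criterion (a): at least 50 copies in any group
MIN_FOLD = 10.0    # criterion (b): at least 10-fold altered expression

def altered(count_a, count_b):
    """True when two counts differ at least MIN_FOLD-fold in either direction."""
    high, low = max(count_a, count_b), min(count_a, count_b)
    return high > 0 and high >= MIN_FOLD * low

def select_candidates(counts):
    """counts maps a miRNA name to normalised counts for 'CAC', 'CA' and 'HC'."""
    abundant = {m for m, c in counts.items() if max(c.values()) >= MIN_COPIES}
    cac_vs_ca = {m for m in abundant if altered(counts[m]["CAC"], counts[m]["CA"])}
    ca_vs_hc = {m for m in abundant if altered(counts[m]["CA"], counts[m]["HC"])}
    # Criterion (c): keep only miRNAs altered in both comparisons.
    return sorted(cac_vs_ca & ca_vs_hc)

# Hypothetical counts: only miR-A meets all three criteria.
example_counts = {
    "miR-A": {"CAC": 5000, "CA": 400, "HC": 30},
    "miR-B": {"CAC": 40, "CA": 3, "HC": 0},      # fails (a)
    "miR-C": {"CAC": 600, "CA": 500, "HC": 20},  # fails (b) for CAC vs CA
}
example_candidates = select_candidates(example_counts)
```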

Confirmation of miRNAs by RT–qPCR analysis

We first tested the 15 candidate miRNAs by RT–qPCR in an independent cohort of 80 serum samples. Four miRNAs (miR-136-5p, miR-195-5p, miR-221-3p and miR-582-5p) with a detection rate of <75% and two miRNAs (miR-148a-3p and miR-3074-5p) with a Cq value of >35 were excluded from further analysis, leaving nine qualified candidate miRNAs. In addition, five miRNAs (miR-17-3p, miR-29a-3p, miR-1260a, miR-1290 and miR-4502) with a P-value of >0.05 were excluded. Consequently, only four miRNAs (miR-19a-3p, miR-223-3p, miR-92a-3p and miR-422a), which were differentially expressed in both the CAC vs CA and the CA vs healthy control comparisons, were selected as candidate biomarkers.

These four miRNAs were further evaluated by RT–qPCR in an additional 240 independent serum samples. Figure 2 shows that miR-19a-3p, miR-92a-3p and miR-223-3p were expressed at higher levels, and miR-422a at a lower level, in CAC patients than in the CA and healthy control groups. Moreover, the levels of these four miRNAs in CAC of different clinical stages and in CA of different grades were compared, as shown in Figures 3 and 4.

Figure 2

Large-scale validation of miR-19a-3p (A), miR-92a-3p (B), miR-223-3p (C) and miR-422a (D) in 94 healthy controls, 66 colorectal adenoma (CA) patients and 160 CAC patients. Expression levels of the microRNAs (miRNAs) (log10 scale on the y-axis) are normalised to the combination of miR-191-5p and U6. The Mann–Whitney U-test was used to assess the statistical significance of differences.

Figure 3

The expression of miR-19a-3p (A), miR-92a-3p (B), miR-223-3p (C) and miR-422a (D) in CAC of different clinical stages. *P<0.05, **P<0.01.

Figure 4

The expression of the four miRNAs in CA of different grades. There was no statistically significant difference in miR-19a-3p, miR-92a-3p, miR-223-3p or miR-422a expression between the low-grade and high-grade intraepithelial neoplasia groups.

The diagnostic accuracy of these four miRNAs was measured by ROC and their corresponding AUCs were 0.849, 0.871, 0.890 and 0.843, respectively (Figure 5A–D).

Figure 5

Receiver operating characteristic (ROC) curves for the ability of the four individual miRNAs (A–D) and the 4-miRNA panel (E) to differentiate the CAC patients from the control group in the training phase. Comparison of ROC curves for the ability of the 4-miRNA panel and carcinoembryonic antigen (CEA) (F) to differentiate the CAC patients from the control group in the validation phase.

Establishing the predictive miRNA panel

In the present study, we constructed the miRNA panel for CAC diagnosis using 320 serum samples as the training data. The predicted probability of a CAC diagnosis was calculated from the stepwise logistic regression model using the following equation: logit(P) = 0.3313 − 0.0081 × miR-19a-3p − 0.0257 × miR-92a-3p − 0.0406 × miR-223-3p + 0.1328 × miR-422a.
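
To show how the equation yields a probability, the following Python sketch applies the reported intercept and coefficients and then converts the logit with the standard inverse-logit transform, P = 1/(1 + e^−logit(P)). The scale of the miRNA input values is not stated in this section, so the inputs below are hypothetical placeholders, and the function names are illustrative.

```python
import math

# Intercept and coefficients as reported in the panel equation.
INTERCEPT = 0.3313
COEFFICIENTS = {
    "miR-19a-3p": -0.0081,
    "miR-92a-3p": -0.0257,
    "miR-223-3p": -0.0406,
    "miR-422a": 0.1328,
}

def panel_logit(levels):
    """Linear predictor logit(P) for one serum sample."""
    return INTERCEPT + sum(COEFFICIENTS[name] * levels[name] for name in COEFFICIENTS)

def panel_probability(levels):
    """Inverse-logit transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-panel_logit(levels)))

# Hypothetical input values on an unspecified scale; not data from the study.
example_levels = {name: 1.0 for name in COEFFICIENTS}
example_logit = panel_logit(example_levels)
example_probability = panel_probability(example_levels)
```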

The diagnostic performance of the established miRNA panel was evaluated by the ROC analysis. Figure 5E shows that the AUC of the established 4-miRNA panel was 0.960.

Validation of the miRNA panel

We further validated the diagnostic performance of the established 4-miRNA panel in the independent validation phase. The AUC of the 4-miRNA panel was 0.951 (95% CI: 0.907–0.978; sensitivity=84.3%, specificity=91.6%), which was significantly higher than that of CEA detection (AUC: 0.667, 95% CI: 0.593–0.735, P<0.001, Figure 5F).
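
Sensitivity and specificity depend on the probability cut-off applied to the panel output, which is not reported in this section. The Python sketch below shows the calculation for an arbitrary cut-off, assuming that a sample is called positive when its predicted probability is at or above the cut-off. The scores and cut-off are hypothetical.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    true_positives = sum(1 for score in case_scores if score >= cutoff)
    true_negatives = sum(1 for score in control_scores if score < cutoff)
    sensitivity = true_positives / len(case_scores)
    specificity = true_negatives / len(control_scores)
    return sensitivity, specificity

# Illustrative predicted probabilities only.
example_sensitivity, example_specificity = sensitivity_specificity(
    [0.9, 0.8, 0.7, 0.2],    # hypothetical CAC patients
    [0.6, 0.3, 0.1, 0.05],   # hypothetical controls
    cutoff=0.5,
)
```

In this toy example three of four cases and three of four controls are classified correctly, so both values are 0.75.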

Moreover, we further evaluated the diagnostic performance of the established miRNA panel at different TNM stages. The corresponding AUCs for patients with TNM stages I, II, III and IV were 0.942, 0.935, 0.954 and 0.983, respectively (Figure 6A–D).

Figure 6

ROC curves for the ability of the miRNA panel to differentiate stage I (A), II (B), III (C) and IV (D) CAC, the low-CEA level group (E) and the high-CEA level group (F) of CAC from the control group, the CAC group from the CA group (G), and the CA group from the healthy control group (H) in the validation phase.

In addition, we evaluated the diagnostic accuracy of the established miRNA panel according to the CEA level. In the low-CEA level (<5 ng ml−1) group, the AUC of the established miRNA panel was 0.810 (95% CI, 0.725–0.818; sensitivity=57.14%, specificity=86.67%; Figure 6E). In the elevated CEA level (>5 ng ml−1) group, the AUC of the established miRNA panel was 0.918 (95% CI, 0.861–0.957; sensitivity=76.79%, specificity=85.56%; Figure 6F).

Finally, we evaluated the diagnostic performance of the 4-miRNA panel in discriminating CA from CAC and from healthy controls. The analysis demonstrated that the miRNA panel possessed a high accuracy in discriminating CA from CAC (AUC=0.886; 95% CI: 0.809–0.940; Figure 6G) and CA from healthy controls (AUC=0.765; 95% CI: 0.669–0.845; Figure 6H).

Discussion

Current methods for CAC diagnosis, including FOBT, CEA detection and colonoscopy, have been used for many years. However, the diagnostic performance of these methods is unsatisfactory, especially for the diagnosis of early-stage CAC (Collins et al, 2005; Kim et al, 2007; Lieberman, 2009). The lack of effective early detection measures contributes at least partly to the high mortality of CAC. The recent discovery of serum miRNA profiles in human cancer has provided a new auxiliary approach for tumour diagnosis (Zhang et al, 2010; Zhou et al, 2011; Liu et al, 2012). Our study presents a genome-wide analysis of serum miRNAs during the normal–CA–CAC sequence.

In this study, we established a proof-of-principle approach to identify a serum miRNA profile of CAC patients. As an initial screening stage, we performed a high-throughput Miseq sequencing assay and excluded possible contamination by other small RNAs. However, the sequencing results from pooled samples might not be consistent with RT–qPCR results obtained from individual serum samples. Therefore, the screening stage was followed by two phases of RT–qPCR validation in our study. Through this approach, four significantly altered serum miRNAs (miR-19a-3p, miR-223-3p, miR-92a-3p and miR-422a) were identified. Our results demonstrated that the 4-miRNA panel possessed a high accuracy in CAC diagnosis. Notably, this panel had the potential to separate stage I/II CAC patients from controls and could predict CAC at a relatively early stage. Furthermore, we demonstrated that the 4-miRNA panel was a more sensitive indicator of CAC than the conventional CEA biomarker. Even in the low-CEA level group, the diagnostic accuracy of the miRNA panel was still acceptable. In addition, we evaluated the diagnostic performance of the 4-miRNA panel in discriminating the CA group from the CAC and healthy control groups, and satisfactory results were obtained.

Previous studies have identified a spectrum of dysregulated miRNAs associated with the tumorigenesis and development of CAC. However, these studies mainly focused on miRNAs in tissues or cells (Slaby et al, 2007; Schetter et al, 2008; Chen et al, 2009; Diosdado et al, 2009; Motoyama et al, 2009; Earle et al, 2010; Liu et al, 2010; Chiang et al, 2011). The reliance on surgical sections and invasive procedures for tissue sample collection limits the application of tissue miRNAs in cancer diagnosis (Liu et al, 2012). A serum miRNA-based biomarker makes it possible to analyse cancers comprehensively without the need for biopsy, surgery or other invasive methods. In fact, the differential expression of several circulating miRNAs has been reported, including miR-92a, miR-29a, miR-17-3p and miR-221 (Ng et al, 2009; Huang et al, 2010; Pu et al, 2010; Faltejskova et al, 2012; Liu et al, 2013). Previous studies showed that miR-92a is significantly increased in the plasma of colorectal cancer (CRC) patients compared with healthy controls, suggesting that it could serve as a CAC biomarker. However, its expression levels are not correlated with TNM stage. MiR-92a can also discriminate advanced adenoma from normal controls but fails to distinguish CAC from CA (Ng et al, 2009; Liu et al, 2013). In contrast, one study reported no significant difference in miR-92a levels between the sera of CRC patients and controls, which provided evidence against the use of serum miR-92a as a new biomarker for the early detection of CRC (Faltejskova et al, 2012). In the present study, we found that miR-92a-3p was upregulated in both the Miseq sequencing and the RT–qPCR results and correlated with TNM stage. It could distinguish not only CAC from healthy controls (AUC: 0.918) and CA from healthy controls (AUC: 0.826, see Supplementary Figure 1B) but also CAC from CA (AUC: 0.762, see Supplementary Figure 1C). We did not further validate miR-92a-5p by RT–qPCR because it had <50 copies in all three groups according to our sequencing results. Moreover, some studies showed that plasma levels of miR-29a and miR-17-3p were significantly increased in CAC patients compared with healthy controls (Ng et al, 2009; Huang et al, 2010; Pu et al, 2010). One study, contradicting these reports, found no significant differences in miR-17-3p and miR-29a levels between the sera of CRC patients and controls (Faltejskova et al, 2012). In the present study, we found that miR-17-3p and miR-29a were not gradually upregulated along the healthy control–CA–CAC sequence in the selection phase. Although miR-19a (Wang et al, 2010) and miR-223 (Fu et al, 2012; Wu et al, 2012) have been found to be upregulated, and miR-422a (Faltejskova et al, 2012) downregulated, in CAC tissues compared with normal mucosa, our study is, to our knowledge, the first report showing the diagnostic value of serum miR-19a-3p, miR-223-3p and miR-422a expression in CAC patients.

Compared with previous studies of circulating miRNAs in CAC diagnosis, our study is distinctive for the following reasons. First, instead of measuring several miRNAs selected from the literature, we screened the genome-wide serum miRNA profiles of CAC via Miseq sequencing. Miseq sequencing is a recently introduced next-generation sequencing technology, and it can measure the absolute abundance of miRNAs, resulting in a better chance of identifying potential diagnostic markers. Second, we included not only the CAC and healthy control groups, but also the CA group. It is well known that CA is the usual benign precursor lesion in the transformation to CAC (Oberg et al, 2011). The clinical course of most CAC may progress from the healthy state through CA to CAC. Therefore, the intersection of differentially expressed miRNAs from the CAC vs CA and CA vs healthy control comparisons should be considered. Failure to do so might explain why previous studies could not discriminate CA from CAC and healthy controls. Third, we validated the newly developed serum miRNA panel in a large independent cohort using a combination of miR-191-5p and U6 snRNA as reference genes. Currently, there is no standard reference gene for circulating miRNA studies. In our previous study, we profiled pooled serum of CAC patients, CA patients and healthy controls (the same samples used in this study), followed by two phases of validation. We found no differential expression of the combination of miR-191-5p and U6 snRNA among the three groups. Therefore, they could be used as reference genes for serum miRNA RT–qPCR studies in CAC (Zheng et al, 2013).

Taken together, we established a serum 4-miRNA panel using a large number of individuals, and this serum miRNA panel could differentiate CAC from CA and healthy controls with high accuracy. Our study also demonstrated that the 4-miRNA panel had important clinical value for the diagnosis of stage I/II CAC and CA patients. In conclusion, our findings provide a relatively accurate means of diagnosing CAC at an early stage, and more patients who would otherwise have missed the curative treatment window might benefit from early diagnosis.