Background
Colorectal cancer (CRC) poses a significant threat to the health of global populations; it is the second most commonly diagnosed cancer in females and the third in males [
1]. CRC develops in a progressive fashion during which normal colon epithelial cells transform to form benign growths such as polyps. These polyps may then progress to benign adenomas, and ultimately to invasive cancer lesions. The progression of the cancer has also been associated with sequential genetic changes in genes such as
K-RAS, APC, DCC, and P53 [
2]. However, CRC is a heterogeneous disease, and patient-related confounding factors such as the anatomic location of the tumour, the race/ethnicity of the patient, and genetic and dietary interactions influence its development [
3].
Screening at-risk populations for CRC has significantly improved the outcome for patients. For instance, diagnosis while the disease remains localised to the colon dramatically improves patient survival, and removal of early lesions such as adenomatous polyps may prevent disease formation [
4]. Several screening tests are currently available to detect CRC, including the faecal occult blood test (FOBT), flexible sigmoidoscopy (FS), optical colonoscopy (OC) and computed tomography colonography (CTC). FOBT is a simple, cheap and safe test that relies on the assumption that large adenomas and cancerous lesions may bleed, and that these blood products are detectable in the faecal matter of patients. Although cheap and non-invasive, this test is vulnerable to false positive and false negative results due to incorrect sample storage or confounding medical complaints such as haemorrhoids. The other examinations involve more costly and invasive procedures which, although they allow direct access to colorectal lesions, suffer from low patient acceptance and carry procedural risks such as perforation of the colon [
4].
The focus of the scientific community has thus shifted to identifying non-invasive biomarkers of disease in bio-fluids such as saliva, urine, and blood. MicroRNAs (miRNAs) are nucleic acid markers that have recently been investigated in this context. MiRNAs are short (20-22 nt) non-coding RNAs that negatively regulate gene expression through either mRNA degradation or translational repression [
5]. MiRNA expression has been shown to be altered in cancerous tissue compared to normal tissue and different miRNAs have been attributed oncogenic and tumour suppressor qualities [
6]. In 2008, Chen et al. detected miRNAs in the serum and plasma of humans and other animals. This primary study illustrated that miRNAs remain stable in serum after being subjected to severe conditions such as extremely low or high pH, 10 freeze-thaw cycles, extended storage, boiling, and RNase digestion [
7]. In addition to their presence in serum and plasma, miRNAs have also been detected in other body fluids such as urine, saliva, and amniotic fluid, making them promising candidates as non-invasive biomarkers of disease [
8].
Expression levels of circulating miRNAs have shown some potential for distinguishing cancer patients from healthy controls for prostate [
9], ovarian [
10], lung [
11,
12], and breast cancers [
13]. Several studies have also investigated circulating miRNA levels for the detection of CRC. Initial approaches analysed small numbers of circulating miRNAs in CRC patient samples compared to normal controls [
14]. Other groups performed miRNA profiling on pooled plasma samples and validated candidate biomarkers on additional individual samples [
15], and others performed profiling on a small number of CRC tissue/serum/plasma samples before validation in a larger sample set [
16]. These studies have produced conflicting results [
17], and so groups have recently begun to perform profiling on larger sample sets and to include plasma from patients with adenomas in addition to CRC to improve the specificity of disease detection [
18].
In 2008, the American Cancer Society released a guideline which highlighted the importance of patients having access to screening tests that facilitate cancer prevention through the early detection of cancer, and the detection and removal of polyps [
4]. A clear deficit in the search for circulating biomarkers for the early detection of CRC to date is the lack of adenomatous polyp samples and the lack of separation of advanced and early stage cancers represented in studies [
14-16, 18-20]. The aim of this study was therefore to investigate the potential of circulating cell-free miRNAs not only as biomarkers of CRC, but also as a means of delineating patients presenting with polyps and benign adenomas from normal and cancer groups. To this end, we performed miRNA profiling for 667 miRNAs on a discovery set of 48 plasma samples comprising 8 normal, 8 polyp, 16 adenoma samples, 8 early stage cancer samples (stage I/II), and 8 advanced cancer samples (stage III/IV). Three candidate miRNAs (miR-34a, miR-150, and miR-923) were then further examined in a validation cohort of 97 independent plasma samples comprising 20 normal, 20 polyp, 20 adenoma samples, 23 early stage cancer samples, and 14 advanced cancer samples. In addition, we confirmed the altered expression of two of the miRNAs in an independent dataset of 40 CRC samples and their paired normal tissues. We found circulating levels of miR-34a and miR-150 to be capable of distinguishing patient groups with benign and malignant diseases of the colon from each other, and sets of miRNAs that distinguish patients with advanced cancer from benign disease groups. Specifically, we found high levels of circulating miR-34a and low miR-150 levels to distinguish patients with polyps from those with advanced cancer, and low circulating miR-150 levels to separate patients with adenomas from those with advanced cancer.
Methods
Patient selection and sample collection
Cases with positive colonoscopy results for malignancy, confirmed by histology as colon or rectal carcinomas, were recruited between December 2007 and December 2010 at the Department of Surgery, Adelaide and Meath Hospital and at the Thomayer Hospital in Prague, Czech Republic. Control subjects and subjects diagnosed with polyps or adenomatous polyps were selected during the same period from individuals undergoing colonoscopy for various gastrointestinal complaints (macroscopic bleeding, positive faecal occult blood test or abdominal pain of unknown origin). All subjects gave written informed consent at the participating site in accordance with the Declaration of Helsinki, as approved by the Tallaght Hospital/St. James’s Hospital Joint Research Ethics Committee (The Adelaide and Meath Hospital, Dublin, Incorporating The National Children’s Hospital, Tallaght, Dublin 24, Ireland) and the Ethical Committee of the Institute of Experimental Medicine, Prague, Czech Republic. See Table
1 for clinical information on samples used.
Table 1
Clinical information on the discovery and validation plasma sample cohorts
| Discovery Cohort | n (M/F) | Age |
| Normal | 8 (4/4) | 67 ± 11 |
| Polyps | 8 (4/4) | 65 ± 7 |
| Adenoma | 16 (8/8) | 56 ± 6 |
| Early Stage Cancer (Stage I/II) | 8 (4/4) | 65 ± 10 |
| Advanced Cancer (Stage III/IV) | 8 (4/4) | 68 ± 8 |
| Validation Cohort | n (M/F) | Age |
| Normal | 20 (12/8) | 63 ± 8 |
| Polyps | 20 (11/9) | 57 ± 7 |
| Adenoma | 20 (12/8) | 62 ± 10 |
| Early Stage Cancer (Stage I/II) | 23 (10/13) | 63 ± 12 |
| Advanced Cancer (Stage III/IV) | 14 (9/5) | 67 ± 8 |
Two separate patient cohorts were identified: a discovery set (n = 48) comprising 8 normal, 8 polyp, 16 adenoma samples, 8 early stage cancer samples (stage I/II), and 8 advanced cancer samples (stage III/IV), and a validation set (n = 97) comprising 20 normal, 20 polyp, 20 adenoma samples, 23 early stage cancer samples, and 14 advanced cancer samples. In addition, an independent public dataset [
21] of quantitative real-time PCR (qRT-PCR) raw data was downloaded from the NCBI GEO archive (accession no: GSE28364) which contains information on 40 CRC samples and their paired normal tissues.
Plasma samples were collected according to standard phlebotomy procedures. A 10 ml blood sample was collected into EDTA plasma tubes and immediately placed on ice. The tubes were centrifuged at 1000 x g for 10 minutes at 4°C. Plasma was removed by pipette from the cellular material, aliquoted into cryovial tubes, labelled and stored at -80°C until the time of analysis. The time from sample procurement to storage at -80°C was less than 3 hours. Each plasma sample underwent no more than 3 freeze/thaw cycles prior to analysis.
Total RNA was isolated from 60 μl of each plasma sample using the miRNeasy mini kit (Cat no 217004, Qiagen). The Qiagen supplementary protocol (Purification of total RNA, including small RNAs, from serum or plasma) was utilised with the following modifications: thawed plasma samples were centrifuged at 1000 x g for 5 minutes at 4°C to remove excess debris, and RNA was extracted from the upper 50 μl of each sample. To elute the RNA, 50 μl of nuclease-free water was added to each spin column and incubated for 1 minute at room temperature before centrifuging into non-stick RNase-free microfuge tubes (Cat no AM12350, Ambion).
MiRNA profiling of plasma with TaqMan® low-density arrays
TaqMan® Array Human MicroRNA A and B Cards v2.0 (Cat no 4400238, Applied Biosystems) were employed to examine the expression of 667 miRNAs in the 48 plasma samples of the discovery cohort. Reverse transcription and quantitative PCR (qPCR) were performed on equal volumes of RNA from each sample according to the manufacturer’s instructions. The TaqMan® MicroRNA Reverse Transcription Kit (Cat no 4366596, Applied Biosystems) and Megaplex RT Primers were used to convert the miRNAs to cDNA, and TaqMan® PreAmp Master Mix (Cat no 4391128, Applied Biosystems) and Megaplex PreAmp Primers were used for a preamplification step before real-time analysis. qPCR was performed using TaqMan® Universal Master Mix II, no UNG (Cat no 4440048, Applied Biosystems) on the 7900HT Fast Real-Time PCR system (Applied Biosystems). The Sequence Detector System software version 2.2.2 was utilised to generate study files using a fixed threshold value of 0.1 for statistical analysis. The profiling data are available under accession no: GSE67075.
Validation of miRNA expression using qRT-PCR
Individual TaqMan® miRNA assays were used for miRNA quantification in the 97 plasma samples in the validation cohort. To improve reverse transcription efficiency, a miRNA multiplex RT primer pool was made from the singleplex RT primers of the four miRNAs to be analysed: miR-34a, miR-150, miR-923, and miR-let7e (the last was used as the endogenous control as it showed very little variation in the discovery cohort, ΔCt SD = 0.865). A 100 μl volume of each 20X RT primer was added to an RNase-free microfuge tube. The tube was dried in a speed vacuum (MAXI dry plus, Medical Supply Company, Ireland) at 50°C for 1 hour. The primers were re-suspended in 100 μl of nuclease-free water, and 300 μl of 0.1X TE buffer was added to yield a 5X multiplex RT primer pool. The TaqMan® MicroRNA Reverse Transcription Kit (Cat no 4366596, Applied Biosystems) was used to perform reverse transcription reactions. Each reaction contained 1.8 μl of RT buffer (10X), 0.18 μl of dNTPs (25 mM), 3.6 μl of miRNA multiplex RT primer pool (5X), 1.2 μl of Multiscribe RT enzyme (50 U/μl), 5.22 μl of nuclease-free water and 6 μl of extracted total RNA. The reactions were incubated at 16°C for 30 minutes, 42°C for 30 minutes and 85°C for 5 minutes (G-STORM, GS1, Somerton Biotechnology Centre, UK).
Real-time PCR analysis was performed on 96-well plates (Cat no 4346906, Applied Biosystems). Technical triplicate PCRs were performed for each sample; no-template controls and a pooled sample containing cDNA from all 97 samples were included on each plate to ensure inter-plate reproducibility. Each reaction contained 1 μl of TaqMan® miRNA assay (20X), 10 μl of TaqMan® Universal Master Mix II, no UNG (Cat no 4440048, Applied Biosystems), 7.67 μl of nuclease-free water, and 1.33 μl of cDNA. The reactions were incubated at 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 15 seconds on the 7900HT Fast Real-Time PCR system (Applied Biosystems). The Sequence Detector System software version 2.2.2 was utilised to generate study files using a fixed threshold value of 0.1 for statistical analysis.
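The reaction recipes above can be checked arithmetically. The following is a minimal Python sketch, not part of the original protocol, that sums the stated component volumes and computes working concentrations from the stated stock concentrations. The dictionary and function names are illustrative assumptions; the volumes are those given in the text.

```python
# Volumes in microlitres, copied from the text. Dictionary and function
# names are illustrative and not part of the original protocol.
RT_REACTION = {
    "RT buffer (10X)": 1.8,
    "dNTPs": 0.18,
    "multiplex RT primer pool (5X)": 3.6,
    "Multiscribe RT enzyme": 1.2,
    "nuclease-free water": 5.22,
    "total RNA": 6.0,
}
QPCR_REACTION = {
    "TaqMan miRNA assay (20X)": 1.0,
    "Universal Master Mix II": 10.0,
    "nuclease-free water": 7.67,
    "cDNA": 1.33,
}


def total_volume(mix):
    """Sum the component volumes of one reaction, rounded to 0.01 ul."""
    return round(sum(mix.values()), 2)


def final_concentration(stock_x, volume_added, total_volume_ul):
    """Concentration (in X) after diluting a stock into the total volume."""
    return stock_x * volume_added / total_volume_ul


rt_total = total_volume(RT_REACTION)
qpcr_total = total_volume(QPCR_REACTION)

# Primer pool: 100 ul of each 20X primer is dried down, then re-suspended
# in 100 ul of water plus 300 ul of TE buffer, i.e. 400 ul in total.
pool_x = final_concentration(20, 100, 100 + 300)

# Working concentrations inside each reaction.
rt_primer_x = final_concentration(pool_x, 3.6, rt_total)
rt_buffer_x = final_concentration(10, 1.8, rt_total)
assay_x = final_concentration(20, 1.0, qpcr_total)

print("RT reaction volume (ul):", rt_total)
print("qPCR reaction volume (ul):", qpcr_total)
print("Primer pool concentration (X):", pool_x)
```

Under these figures the primer pool, RT buffer and TaqMan assay each work out to a 1X working concentration, which is consistent with the stock concentrations quoted in the text.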
Statistical analysis
In the discovery cohort (n = 48), each miRNA was normalised by the ΔΔCt method using the average within-sample Ct value [22]. This technique uses the mean expression value of all expressed microRNAs in a given sample as a normalisation factor for microRNA real-time quantitative PCR data. Thus, the average within-sample Ct value for each card is calculated by averaging all miRNA Ct values for each individual sample. This was performed using the Bioconductor package HTqPCR (www.bioconductor.org). The non-parametric Kruskal-Wallis test was used to determine between-group variations by rank, as the data were not normally distributed. A Wilcoxon rank sum test was subsequently used to perform pair-wise comparisons between the 5 groups for the significant miRNAs identified by the Kruskal-Wallis test.
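The normalisation and rank-based group comparison described above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the HTqPCR implementation used in the study: the function names, the optional `max_ct` cutoff for "expressed" miRNAs, and the toy Ct values are all hypothetical. The Kruskal-Wallis function returns only the tie-corrected H statistic; a p-value would additionally require a chi-squared distribution with (number of groups - 1) degrees of freedom.

```python
from collections import Counter
from statistics import mean


def mean_centre_normalise(ct_by_mirna, max_ct=None):
    """Subtract the within-sample mean Ct from every miRNA Ct of ONE sample.

    ct_by_mirna maps miRNA name -> raw Ct (None if undetermined).
    max_ct is a hypothetical cutoff: Cts at or above it count as not expressed.
    """
    expressed = {
        name: ct
        for name, ct in ct_by_mirna.items()
        if ct is not None and (max_ct is None or ct < max_ct)
    }
    if not expressed:
        raise ValueError("no expressed miRNAs in this sample")
    sample_mean = mean(expressed.values())
    return {name: ct - sample_mean for name, ct in expressed.items()}


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups.

    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


# Toy Ct values for one sample (illustrative only).
sample = {"miR-34a": 30.0, "miR-150": 26.0, "miR-923": 28.0}
print(mean_centre_normalise(sample))
```

In practice the normalisation would be applied to each of the 48 samples in turn, and the H statistic computed per miRNA across the five groups.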
As an alternative to spiking un-related miRNA constructs into our samples we utilised the miRNA profiling data of the discovery cohort of samples to choose an appropriate endogenous control for use in the validation cohort. This involved analysing the expression of all 667 miRNAs across all 48 samples in the discovery cohort allowing us to choose one of the least variant miRNAs. As MammU6 showed highly variant expression in the discovery cohort, miR-let7e was chosen for use as an endogenous control for the validation set as it was one of the least variant miRNAs in the discovery phase experiment (ΔCt standard deviation of 0.86). When the let7e Cts were examined across all samples in the validation cohort this miRNA proved an appropriate endogenous control with a Ct standard deviation of 1.64. Statistically significant differences were determined using the non-parametric Wilcoxon rank sum test. The p-values for the validation set were adjusted using the Benjamini and Hochberg method [23] to account for multiple testing.
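Two steps in this paragraph lend themselves to a short illustration: ranking miRNAs by the standard deviation of their normalised Ct values to find a stable endogenous control, and adjusting p-values with the Benjamini and Hochberg procedure. The sketch below is plain Python with hypothetical function names and toy numbers; it is not the code used in the study, which was run in R.

```python
from statistics import stdev


def least_variant(delta_ct_by_mirna):
    """Return (name, sd) for the miRNA with the smallest standard deviation.

    delta_ct_by_mirna maps miRNA name -> list of normalised Ct values,
    one per sample. miRNAs with fewer than two values are skipped.
    """
    sds = {
        name: stdev(values)
        for name, values in delta_ct_by_mirna.items()
        if len(values) > 1
    }
    name = min(sds, key=sds.get)
    return name, sds[name]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values stay monotone in the raw p-values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy values (illustrative only, not study data).
candidates = {
    "control_candidate_A": [0.1, -0.2, 0.0, 0.1],
    "control_candidate_B": [1.5, -2.0, 0.7, -0.9],
}
print(least_variant(candidates))
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

The adjustment multiplies each p-value by the number of tests divided by its rank, which is why the smallest of three p-values is tripled while the largest is left unchanged.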
For consistency, the independent public dataset from Reid et al. [21] (accession no: GSE28364) was normalised using the same approach used to analyse the discovery cohort qRT-PCR data. This independent study used TaqMan® Array Human MicroRNA Cards v2.0 to analyse miRNA expression in 40 CRC tumour samples and their paired normal tissues. In order to mimic this structure in our validation plasma sample cohort, we grouped samples into ‘non-malignant’ and ‘malignant’ groups. As there were only two groups (normal versus cancer) in this analysis, the Wilcoxon rank sum test was used to determine significantly differentially regulated miRNAs. For this analysis of the validation cohort, miR-34a, miR-150 and miR-923 were first normalised against the endogenous control (miR-let7e) and the Wilcoxon rank sum test was used to determine significance between the groups.
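The two-group comparison described above can be illustrated with a small sketch: each target Ct is first expressed relative to the endogenous control, and the two groups are then compared with a rank sum test. The code below is plain Python with hypothetical names and toy inputs. It uses the normal approximation with a tie correction and no continuity correction, so for small samples its p-values will differ somewhat from an exact test such as the one R can apply.

```python
import math
from collections import Counter


def delta_ct(target_ct, control_ct):
    """Ct of the target miRNA relative to the endogenous control."""
    return target_ct - control_ct


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (U statistic for x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all values are identical; the test is undefined")
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Toy normalised values for two groups (illustrative only, not study data).
non_malignant = [delta_ct(ct, 25.0) for ct in (29.0, 29.5, 30.2, 28.8)]
malignant = [delta_ct(ct, 25.0) for ct in (31.0, 31.4, 30.9, 32.1)]
print(rank_sum_test(non_malignant, malignant))
```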
Logistic regression (LR) and receiver operating characteristic (ROC) curve analysis were performed on miR-34a, miR-150 and miR-923 in the validation cohort. The markers were combined using LR and the ROC curves were used for interpretation of the models generated. The area under the curve (AUC) from the ROC curve for a given model was used to determine the probability of a correct prediction. The LR model for single miRNAs or combinations of miRNAs which gave the highest AUC was considered the most discriminating model and therefore the best marker for distinguishing between the groups of interest. All calculations were carried out in the R statistical environment (http://cran.r-project.org/) using the HTqPCR and stats packages.
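To make the combination of markers concrete, the sketch below fits a two-marker logistic regression by simple gradient ascent and scores it by AUC. It is a plain-Python stand-in for the R analysis described above, not a reproduction of it: the function names, learning rate, iteration count and the six toy samples are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights [intercept, w1, w2, ...] by batch gradient ascent."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = y - predict(weights, x)
            gradient[0] += error
            for j, xj in enumerate(x):
                gradient[j + 1] += error * xj
        weights = [
            w + learning_rate * g / len(rows)
            for w, g in zip(weights, gradient)
        ]
    return weights


def predict(weights, x):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def auc(scores, labels):
    """Area under the ROC curve: the share of (positive, negative) pairs
    in which the positive sample scores higher; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy two-marker data (illustrative only): label 0 = benign, 1 = advanced.
rows = [(-1.0, 1.2), (-0.8, 0.9), (-1.2, 1.0),
        (0.9, -1.1), (1.1, -0.7), (0.7, -1.0)]
labels = [0, 0, 0, 1, 1, 1]

weights = fit_logistic(rows, labels)
toy_auc = auc([predict(weights, x) for x in rows], labels)
print("weights:", weights)
print("AUC on the toy data:", toy_auc)
```

Comparing the AUC of single-marker fits against the two-marker fit mirrors the model selection rule stated above, where the model with the highest AUC is taken as the most discriminating. An AUC computed on the same samples used for fitting, as here, is optimistic and would need independent data to confirm.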
Discussion
Screening for the early detection of CRC is important to improve patient survival and facilitate cancer prevention through the detection and removal of polyps. The aim of this study was therefore to investigate the potential of circulating cell-free miRNAs not only as biomarkers of CRC, but also as a means of delineating patients presenting with precancerous lesions, i.e. polyps and benign adenomas, from normal and cancer patient groups. MiRNA profiling was performed in the discovery sample cohort consisting of five groups: normal, polyps, benign adenomas, early stage cancer, and advanced cancer (Table
2) and identified three candidate miRNAs (miR-34a, miR-150, and miR-923) which were then further examined in a validation cohort of 97 samples divided into the same five groups as before (Figure
1 and Table
3). In addition, we confirmed that the altered circulating levels of miR-34a and miR-150 mirror the expression changes evaluated in the tumours of an independent dataset of 40 CRC samples and their paired normal tissues (Figure
2).
miR-34a is a p53-regulated miRNA that has been shown to influence both cellular senescence and apoptosis [
24,
25]. Different studies have demonstrated its up- or down-regulation in CRC compared to normal tissue (as reviewed in [
Wu et al. [
27] demonstrated the involvement of this miRNA in CRC invasion and metastasis through targeting FRA1, a FOS transcription factor that is capable of forming activator protein-1 (AP-1) heterodimers. Increased levels of this circulating miRNA have been detected in patients with chronic hepatitis C and non-alcoholic fatty liver disease [
28] and levels have been found to be decreased in whole blood samples of patients with CRC compared to healthy controls [
29]. Brunet and colleagues studied miRNA expression in stage III CRC tissue samples compared to normal controls and found miR-34a to be significantly up-regulated [
30].
In one of the first studies to investigate the altered expression of miRNAs in cancer [
31], miR-150 was shown to be up-regulated in colorectal cancer tissue compared to normal tissue. However, several subsequent studies have shown this miRNA to be down-regulated in CRC tissue compared to normal tissue [
21,
32]. Indeed, a recent study of 239 samples by Ma et al. found that miR-150 was down-regulated in adenoma and CRC tissues compared to normal tissue, and this down-regulation was associated with decreased overall survival and a worse response to adjuvant chemotherapy [
33]. Decreased circulating levels of miR-150 have been identified in patients with acute myeloid leukaemia [
34], and have been associated with poor prognosis for critically ill patients [
35]. Furthermore, Wang and colleagues found miR-150 expression to be down-regulated in their 10 pooled CRC plasma samples compared to 10 pooled control samples, although its altered expression was not validated in their additional individual samples [
15]. To our knowledge, no studies have reported miR-923 expression in CRC or detected circulating levels of this miRNA; however, it has been shown to be down-regulated in chronic lymphocytic leukaemia patients [36] and up-regulated in taxol-resistant breast cancer cells [
37].
Of the three miRNAs analysed in our validation cohort, only miR-34a distinguished the normal group from the disease groups (Figure
1A). A large amount of inter-individual variability was noted in the normal samples assessed for miR-150 and miR-923 expression which may account for the fact that they do not significantly separate the normal group from the disease groups (Figure
1B&C). This variability may arise because these subjects had sufficient medical complaints to present for colonoscopy; although they did not have polyps, adenomas or cancer of the colon, they may have had other conditions such as irritable bowel disease which may have influenced the results. In addition, the larger number of adenoma samples (n = 16) compared to the other sample groups (n = 8 each) in the discovery cohort may explain why we observed statistically significant alterations in miR-150 and miR-923 in the initial analysis (Table
2) but not in the validation cohort (Table
3). Despite the variability within sample groups, we found circulating levels of miR-34a and miR-150 to be capable of distinguishing cancer patients from the non-malignant group of patients (Figure
2); in addition, they were capable of delineating patient groups with different diseases of the colon from each other (Table
3). Moreover, the discovery miRNA profiling results (Table
2) provide additional miRNA candidates, for instance miR-144-5p, that may have potential as circulating biomarkers of CRC and that can be explored and independently validated by other research groups. In our opinion, this is an important step towards the identification of specific biomarkers for early stages of disease.
There have been several publications examining the potential for miRNAs to act as circulating biomarkers for the detection of CRC. Recently, Faltejskova and colleagues attempted to validate the serum levels of four miRNAs (miR-17-3p, miR-29a, miR-92a and miR-135b) that had been proposed by other groups as potential circulating biomarkers of CRC. They used qPCR to assess the miRNA expression levels in 100 CRC patients and 30 healthy controls, and did not detect any significant changes in the expression of any of the miRNAs evaluated [
17]. We examined the lists of significantly differentially expressed miRNAs in our discovery cohort to determine whether we also identified biomarkers found by other groups. We did not find miR-21 [
19], miR-141 [
38], miR-29a, miR-17-3p or miR-92 [
14,
16], miR-601 or miR-760 [
15] to be significantly differentially expressed in any of the disease groups compared to our healthy controls. We did, however, find miR-19a, miR-19b, and miR-15b significantly altered in some of our comparisons (see Table
2). These miRNAs were among those found by Giráldez et al. to be significantly up-regulated in plasma samples of CRC patients in their study in 2012 [
18].
Although there is some concordance among the results of different groups in the search for biomarker miRNAs, uncertainty remains as to which miRNAs are the most appropriate markers of disease. Disparities in patient age and time of sample collection (i.e. before or after surgery/treatment) in different studies may impact on the reproducibility of results. In addition to these variables, it has been noted previously that miRNA profiles vary between different ethnic groups [
39] and between male and female patients [
40], and that blood cell contaminants can contribute to circulating miRNA profiles [
40,
41]. Blood cell contaminants of plasma and serum samples may be of particular importance in evaluating the potential of circulating miRNAs as biomarkers of disease. In fact, Pritchard and colleagues suggest that the elevated miR-92a levels detected in the plasma of patients with colon cancer are due to the higher levels of red blood cell haemolysis in patients with this disease [
41]. This poses the question as to whether all of these putative biomarkers should be discarded due to their expression in blood cells, or whether extensive validation and perhaps additional profiling on larger, more diverse patient cohorts will confirm the most reliable biomarker candidates. We would argue against discarding biomarkers because of their detection in haematopoietic cells, particularly if, as we have shown, their expression reflects that found within the tumour (Figure
2). If we were to discard markers for their presence in blood cells, we would also have to discount miR-21 as a valid marker of disease, as it was also detected in blood cells by Duttagupta and colleagues [
40]. This miRNA is commonly up-regulated in cancer, has been identified in the serum and stool samples of cancer patients, and multiple studies have linked its expression to advanced disease and worse outcome for patients [
42]. In an effort to control confounding factors in this study, all samples were age- and sex-matched, blood was taken at the time of colonoscopy before treatment commenced, and an additional centrifugation step to remove cellular debris prior to RNA extraction recommended by Duttagupta and colleagues [
40] was included in the sample processing.
In order to examine the diagnostic potential of our three candidate miRNAs in detecting different diseases of the colon we employed ROC curve analysis. To identify the most powerful candidate combinations we focused our analysis on comparing sample groups that showed significant differential expression of more than one of the three miRNAs. This approach allowed us to identify marker combinations which distinguish patient groups with benign disease of the colon from those with advanced stage cancer (Figure
3). Specifically, miR-34a and miR-150 abundance together differentiated patients with polyps from those with advanced cancer (AUC = 0.904), and miR-150 abundance separated patients with adenomas from those with advanced cancer (AUC = 0.875). To further confirm the true diagnostic potential of these circulating cell-free miRNAs, it is now important for these results to be independently replicated in additional samples by another group. If this independent validation were successful, a prospective validation of the miRNA candidate biomarkers would be warranted.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
STA performed the RNA extraction, miRNA profiling of plasma with TaqMan Low-Density Arrays, validation of miRNA expression using qRT-PCR, participated in analysis and interpretation of the results and prepared the manuscript. SFM performed all of the bioinformatic/statistical analysis. STA, SFM, DJH, BP, AC, ML, PD, PN, PD and MC contributed to the result interpretation and manuscript preparation. DJH, BP, AC, ML, PD, PN assisted in the collection and provision of clinical samples. PD and MC conceived the study, participated in its design, coordination and interpretation of the results and finalized the manuscript. All authors read and approved the final manuscript.