Background
Colorectal cancer (CRC) is regarded as one of the most frequent malignant tumours globally [
1]. This heterogeneous disease can develop through at least three distinct molecular pathways, in which genetic and/or epigenetic dysregulation alters gene expression and protein levels, ultimately leading to colorectal adenoma and carcinoma formation [
2,
3]. One of the epigenetic alterations that can contribute to CRC formation is the abnormal DNA hypermethylation of promoters, resulting in reduced or absent gene expression [
4]. DNA hypermethylation occurs at regulatory sites (e.g. promoters) in a tissue- and cancer type-specific manner [
5]. Besides genetic alterations, DNA hypermethylation of tumour suppressor genes is a frequently detected mechanism behind the inactivation of these genes leading to tumour initiation [
6]. Although more and more genes are associated with various types of cancers, our knowledge of DNA methylation markers in CRC development remains incomplete.
Another key posttranscriptional epigenetic regulator of gene expression, miRNA, regulates the stability and translation of mRNAs. The expression of miRNAs has been shown to differ in colorectal tumours compared to healthy colon tissue specimens, and several experimental results indicate that they play a role in colorectal cancer formation. Up- and downregulation of certain miRNAs was identified along the adenoma-carcinoma sequence of CRC, and evidence supports the role of miRNAs in CRC development and progression, as these small non-coding RNAs affect proliferation and invasion [
7].
The identification of genes affected by epigenetic changes can be achieved using whole genome gene expression analysis [
8]. DNA methylation and miRNA expression alterations can both lead to a certain degree of downregulation of mRNA expression and consequently of protein levels, which can be confirmed by immunohistochemistry.
In the present study, our aims were (1) to identify DNA methylation markers in CRC samples on the basis of whole genome gene expression analysis and (2) to analyse the DNA methylation levels of these candidate markers along the colorectal adenoma-carcinoma sequence in colorectal adenoma and cancer samples. Furthermore, (3) we aimed to confirm the relationship between gene expression, DNA methylation status, miRNA expression and protein levels of the analysed candidate markers.
Methods
Selection of candidate genes regulated by DNA methylation
The selection of candidate genes was based on expression data generated from 147 colonic biopsy specimens (from 49 normal, 49 adenoma, and 49 CRC patients) and from laser capture microdissected colonic epithelial cells (from 6 NAT, 6 adenomas, and 6 CRC), analysed previously by whole genome HGU133 Plus 2.0 microarrays (Affymetrix) [
8,
9]. These data files are available in the Gene Expression Omnibus database (
http://www.ncbi.nlm.nih.gov/geo/) under series accession numbers GSE4183 (8 normal, 15 adenoma and 15 CRC), GSE10714 (3 normal, 5 adenoma and 7 CRC), GSE37364 (38 normal, 29 adenoma and 27 CRC) and GSE15960 (laser microdissected colonic epithelial cells from 6 normal, 6 adenoma and 6 CRC). Clinical data of patients involved in the analysed gene expression studies can be found in Additional file
1: Table S1.
Although the bioinformatic analysis and the candidate selection were based on previously published raw gene expression data from HGU133 Plus 2.0 microarrays, the aim of the present study was substantially different from that of the previously published studies. We aimed to identify genes with gradually altering expression in adenoma and tumour samples that can be potentially regulated by DNA methylation. The data sets GSE4183 [
10], GSE10714 [
11], GSE37364 [
9], and GSE15960 [
8] were analysed to identify genes potentially regulated by DNA methylation. Transcripts with gradually decreasing or increasing expression along the adenoma-carcinoma sequence were selected on the basis of Kendall (tau coefficient) rank correlation analysis (−0.5 ≤ tau coefficient ≤ 0.5). DNA methylation analysis was performed for genes with CpG island(s) on the basis of
in silico prediction by the CpG Plot EMBOSS application (
http://www.ebi.ac.uk/Tools/seqstats/emboss_cpgplot/) [
12].
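The rank-correlation step described above can be illustrated with a minimal sketch. It computes Kendall's tau-b between an ordinal stage code (0 = normal, 1 = adenoma, 2 = CRC) and the expression values of one transcript. The stage coding, the example values, the use of the tie-corrected tau-b variant and the cut-off applied at the end (|tau| ≥ 0.5, i.e. keeping strongly monotone transcripts) are assumptions made for illustration only, not details confirmed by the study.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation; tolerates ties in x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    # Pairs untied in x: C + D + tied_y_only; pairs untied in y: C + D + tied_x_only.
    denom = sqrt((concordant + discordant + tied_y_only)
                 * (concordant + discordant + tied_x_only))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical log2 intensities of one transcript in 2 normal, 2 adenoma, 2 CRC samples.
stages = [0, 0, 1, 1, 2, 2]
expression = [9.1, 8.8, 7.5, 7.9, 6.0, 6.2]
tau = kendall_tau_b(stages, expression)
is_candidate = abs(tau) >= 0.5  # assumed reading of the cut-off
```

Because many samples share the same stage, the tie-corrected tau-b is used here; a transcript whose expression falls steadily from normal to CRC yields a strongly negative coefficient.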
Expression of the selected gene set was also analysed on gene expression data sets of human colorectal cell lines before and after DNA demethylation treatment with 5-Aza (GSE29060: 10 μM 5-Aza treatment for 72 h on HT-29 cell line; GSE14526: 3 μM 5-Aza treatment for 72 h on HCT116 and SW480 cell lines; GSE32323: 0.5 μM 5-Aza treatment for 72 h on Colo32, HCT116, HT-29, RKO and SW480 cell lines).
Student's t-test with Benjamini-Hochberg correction was applied to determine the significance of gene expression and DNA methylation level comparisons (p < 0.05). For logFC, a threshold of |difference of mean intensity values| > 1 was applied.
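As an illustration of the significance filtering, the following sketch adjusts a list of p-values with the Benjamini-Hochberg procedure and then applies the absolute difference threshold. The p-values are assumed to come from the t-tests, the intensities are assumed to be on a log2 scale, and the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def passes_filters(adjusted_p, mean_a, mean_b, alpha=0.05, min_abs_diff=1.0):
    """Significant after correction and |difference of mean intensities| > 1."""
    return adjusted_p < alpha and abs(mean_a - mean_b) > min_abs_diff


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each adjusted value is the raw p-value scaled by the number of tests over its rank, capped so that the adjusted values never decrease with rank.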
Tissue sample collection
For DNA methylation analysis, tissue specimens were obtained from surgically removed colon tumours (moderately differentiated, Dukes B-C stages; MSS) (
n = 15) and from histologically normal adjacent tissue (NAT) (
n = 15) derived from the furthest available area away from the tumour. In addition, adenomas (
n = 15) were also analysed, comprising biopsy samples (
n = 10) and fresh frozen tissue samples (
n = 5). Fresh frozen samples were snap-frozen in liquid nitrogen directly after surgery and stored at −80 °C. Written informed consent was obtained from all patients, and the study was approved by the local ethics committee (approval numbers TUKEB 2005/037 and TUKEB 2008/69, Semmelweis University Regional and Institutional Committee of Science and Research Ethics, Budapest, Hungary). The study was performed according to the ethical standards of the revised version of the Helsinki Declaration. Clinical data of patients involved in the study can be found in Additional file
2: Table S2.
Laser capture microdissection, macrodissection
Frozen tissue samples were embedded in OCT compound (Sakura Finetek, Japan). Then, 10 μm cryosections were cut at −20 °C in a cryostat instrument and mounted on PALM Membrane Slides 1.0 PEN (Carl Zeiss, Bernried, Germany). After fixation with 70 % ethanol for 5 min and absolute ethanol for 2 min, slides were stained with cresyl violet acetate (Sigma-Aldrich, St. Louis, USA). Colonic epithelial and stromal cells (approx. 10³ cells) were collected using the PALM Microbeam laser capture microdissection system (PALM, Bernried, Germany). Macrodissected samples were collected from cryosections after toluidine blue staining. Selected areas containing both stromal and epithelial cells were harvested by scratching the tissue slide with a single-use needle.
DNA methylation analysis
Bisulfite conversion
Bisulfite conversion was performed using the EZ DNA Methylation Direct Kit (Zymo Research) without prior DNA isolation. Proteinase K digestion was performed in 20 μl (according to Section I Protocol A) followed by bisulfite conversion. The elution volume was 20 μl.
Bisulfite-specific PCR (BS-PCR)
In silico CpG island prediction was performed by CpG Plot EMBOSS Application (
http://www.ebi.ac.uk/Tools/seqstats/emboss_cpgplot/). Bisulfite-specific PCR reactions were performed using primers designed with PyroMark Assay Design software (SW 2.0, Qiagen, Hilden, Germany) to be specific for non-CpG regions in order to amplify the bisulfite converted DNA samples without discriminating between methylated and non-methylated sequences (Table
1). PCR primers in the opposite direction of sequencing primers were biotin labelled. Primer specificities were tested
in silico by BiSearch software (
http://bisearch.enzim.hu) [
13].
Table 1
Genes analysed in the study. Genes with gradually decreasing or increasing expression along the adenoma-carcinoma sequence with predictable CpG islands were selected on the basis of Kendall (tau coefficient) rank correlation analysis (−0.5 ≤ tau coefficient ≤ 0.5)
ALDH1A3 | aldehyde dehydrogenase 1 family, member A3 |
BCL2 | B-cell CLL/lymphoma 2 |
CDX1 | caudal type homeobox 1 |
COL1A2 | collagen, type I, alpha 2 |
CYP27B1 | cytochrome P450, family 27, subfamily B, polypeptide 1 |
ENTPD5 | ectonucleoside triphosphate diphosphohydrolase 5 |
FADS1 | fatty acid desaturase 1 |
MAL | mal, T-cell differentiation protein |
PRIMA1 | proline rich membrane anchor 1 |
PTGDR | prostaglandin D2 receptor (DP) |
PTGS2 | prostaglandin-endoperoxide synthase 2 |
SFRP1 | secreted frizzled-related protein 1 |
SFRP2 | secreted frizzled-related protein 2 |
SOCS3 | suppressor of cytokine signaling 3 |
SULF1 | sulfatase 1 |
SULT1A1 | sulfotransferase family, cytosolic, 1A, phenol-preferring, member 1 |
THBS2 | thrombospondin 2 |
TIMP1 | metallopeptidase inhibitor 1 |
BS-PCR reactions were performed using AmpliTaq Gold 360 mastermix (2x) (Life Technologies, Carlsbad, USA), LightCycler 480 ResoLight Dye (40x) (Roche Applied Science), primer mix (200 nM final concentration), and bisulfite converted DNA samples (approx. 10 ng bcDNA/well) in 15 μl final volume. Real-time PCR amplification was carried out with the following thermocycling conditions on the LightCycler 480 System: 95 °C for 10 min, then 10 touchdown cycles of 95 °C for 30 s, 60 °C with a 0.4 °C decrease/cycle for 30 s, and 72 °C for 30 s, followed by 40 amplification cycles of 95 °C for 30 s, 56 °C for 30 s, and 72 °C for 30 s.
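The touchdown phase of this programme can be written out as a short sketch that lists the annealing temperature of each of the 10 touchdown cycles. Whether the first cycle runs at 60 °C or already at 59.6 °C is not stated; the sketch assumes the first cycle anneals at 60 °C, and the function name is hypothetical.

```python
def touchdown_annealing(start_c=60.0, step_c=0.4, cycles=10):
    """Annealing temperature (°C) of each touchdown cycle, first cycle at start_c."""
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


temps = touchdown_annealing()
# Under this assumption the last touchdown cycle anneals 0.4 °C above the
# 56 °C used in the 40 amplification cycles that follow.
```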
Because it provides single-base resolution information about the methylation status of a CpG island, direct sequencing is one of the most robust methods to analyse BS-PCR products. After bisulfite treatment and BS-PCR, all cytosines are converted to thymines except for those originally methylated. Two different pyrosequencing technologies were applied to analyse DNA methylation of BS-PCR products, i.e. the Qiagen PyroMark System and the Roche GS Junior System utilising the 454 technology. The read lengths of the two technologies differ: with the PyroMark system, sequences of up to 60 bp can be analysed, while read lengths of up to 400 bp can be achieved with the 454 technology.
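The principle behind bisulfite sequencing can be shown with a small in silico sketch: every cytosine is read as thymine after conversion and PCR unless it was methylated, so the share of C among the C and T calls at a CpG position estimates its methylation level. The sequence, positions and read counts below are invented for illustration, and the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Replace every C by T except at the given methylated 0-based positions."""
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def methylation_percent(c_count, t_count):
    """Methylation level at one CpG site from the C and T calls covering it."""
    total = c_count + t_count
    if total == 0:
        raise ValueError("no informative reads at this position")
    return 100.0 * c_count / total


converted = bisulfite_convert("ACGTCCGA", methylated_positions=[1])
level = methylation_percent(c_count=30, t_count=70)
```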
PyroMark Q24 sequencing
Pyrosequencing was performed on a PyroMark Q24 instrument (Qiagen) using PyroMark Gold Q24 Reagents (Qiagen) according to the manufacturer’s recommendations. Purification and subsequent processing of the biotinylated single-stranded DNA were performed in two consecutive runs by applying two different sequencing primers in order to cover more CpG sites in the amplicons [
14,
15]. Sequencing results were analysed using the PyroMark Q24 software v2.0.6 (Qiagen).
GS Junior sequencing
Library preparation with ligated adaptors and emulsion-PCR amplification were performed as described in “Guidelines for Amplicon Experimental Design”. The concentrations of BS-PCR amplicons were measured with a Qubit fluorometer using High Sensitivity dsDNA reagent (Life Technologies). Amplicons belonging to the same sample were pooled at an equimolar ratio and PCR products were purified with AMPure beads (Agencourt, Beckman Coulter Genomics, Pasadena, USA) according to the manufacturer’s standard protocol. The Agilent Bioanalyzer was used with the High Sensitivity DNA Chip (Agilent, Santa Clara, USA) to assess sample quality. Fragment End Repair was performed using the GS FLX Titanium Rapid Library Preparation Kit (Rapid Library Preparation Method Manual 3.2). RL MID Adaptor Ligation was carried out using the GS FLX Titanium Rapid Library Preparation Kit (Rapid Library Preparation Method Manual 3.4). After ligation, amplicon libraries were purified with AMPure beads, and library quality was assessed using the Agilent Bioanalyzer with the High Sensitivity DNA Chip. Libraries were quantified by fluorometric measurement with Qubit High Sensitivity dsDNA reagent and mixed at an equimolar ratio, with different samples identified by different MID adaptors. Amplicon library pools were then amplified by emPCR at a ratio of 0.5 DNA molecules per bead using the Lib-L emPCR Kit. Since amplicon lengths were short, the emPCR procedure was performed with reduced Amp Primer quantity (emPCR Amplification Method Manual – Lib-L, GS Junior Titanium Series, Live Amp Mix for paired end libraries). Bead enrichment and sequencing were performed using the GS Junior Titanium Sequencing Kit and the method described in the Sequencing Method Manual, GS FLX Titanium Series.
The Smith-Waterman algorithm with Gotoh’s improvement was used for matching the reads to template sequences in the JAligner software package [
16,
17]. As 454 technology can produce sequencing errors in homopolymer stretches, e.g. in bisulfite-sequencing templates [
18], gaps or insertions were frequently observed in the sequenced reads. Reads with a minimum of 80 % of maximum alignment score were analysed further, after which the actual nucleotides at the potential methylation sites were summarised.
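The read-matching step can be sketched as follows. This is not the JAligner implementation; it is a minimal score-only version of Smith-Waterman local alignment with affine gap penalties (Gotoh's recurrences), followed by the 80 % filter. The scoring values and the definition of the maximum score as a perfect match over the whole template are assumptions, and the function names are hypothetical.

```python
def local_affine_score(read, template, match=2, mismatch=-1,
                       gap_open=-3, gap_extend=-1):
    """Best local alignment score with affine gaps (Smith-Waterman + Gotoh)."""
    neg_inf = float("-inf")
    cols = len(template) + 1
    h_prev = [0] * cols        # best score of alignments ending at (i-1, j)
    f_prev = [neg_inf] * cols  # ... ending with a gap that skips read bases
    best = 0
    for i in range(1, len(read) + 1):
        h_cur = [0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf            # ... ending with a gap that skips template bases
        for j in range(1, cols):
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            s = match if read[i - 1] == template[j - 1] else mismatch
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def keep_read(read, template, min_fraction=0.8, match=2):
    """Keep reads scoring at least min_fraction of the assumed maximum score."""
    max_score = match * len(template)  # assumed: perfect match over the template
    return local_affine_score(read, template, match=match) >= min_fraction * max_score


template = "ACGTACGT"
perfect = local_affine_score("ACGTACGT", template)
one_deletion = local_affine_score("ACGTCGT", template)
```

With affine penalties, opening a gap costs more than extending it, which suits 454 reads where a single homopolymer miscall can insert or delete several bases at once.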
miRNA analysis
miRNA analysis was performed on an independent formalin-fixed, paraffin-embedded (FFPE) sample set including CRC (
n = 3), adenomas (
n = 3) and NAT (
n = 3) samples. miRNA isolation was performed with the High Pure miRNA kit (Roche) and the expression of approximately 800 miRNAs was assessed on Human Panel I + II (Exiqon) with the miRCURY™ Universal RT microRNA PCR protocol according to the manufacturer’s instructions. Raw Ct data were normalised with interplate calibrators and then to miR-423-5p, used as a reference because it was expressed at relatively constant levels in the analysed samples.
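Normalisation to a reference miRNA can be illustrated with a short sketch. It assumes the common ΔΔCt approach, in which the Ct of the reference (here miR-423-5p) is subtracted from the Ct of the target and relative expression is expressed as 2^−ΔΔCt; the text does not state that this exact formula was used, interplate calibration is not shown, and the Ct values are invented.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target miRNA relative to the reference miRNA in one sample."""
    return ct_target - ct_reference


def fold_change(dct_sample, dct_control):
    """Relative expression of sample versus control as 2 ** -(ddCt)."""
    return 2 ** -(dct_sample - dct_control)


# Hypothetical Ct values: target miRNA and miR-423-5p in a tumour and a NAT sample.
dct_tumour = delta_ct(ct_target=24.0, ct_reference=26.0)
dct_nat = delta_ct(ct_target=26.0, ct_reference=26.0)
relative_expression = fold_change(dct_tumour, dct_nat)
```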
In silico miRNA prediction was performed for all analysed genes using the miRWALK database prediction algorithm including validated mRNA targets [
19] in order to select experimentally verified miRNA interaction information associated with genes, pathways, organs, diseases, cell lines, OMIM disorders, and literature on miRNAs. Subsequently, expression of selected miRNAs in normal, adenoma and cancer samples was compared.
Immunohistochemistry
Among the 18 analysed genes, the SFRP1 protein level was analysed because of the special interest of our working group. Surgically removed colonic tissues from NAT (n = 10), AD (n = 10), and CRC specimens (n = 10) were fixed in formalin and embedded in paraffin, and tissue microarrays (TMA) were constructed. Four-μm sections were cut, deparaffinised, and rehydrated. For SFRP1 staining, antigen retrieval was performed in TRIS EDTA buffer (pH 9.0) using a microwave (900 W for 10 min, 340 W for 40 min). Samples were incubated with anti-SFRP1 rabbit polyclonal antibody (ab4193, Abcam, Cambridge, UK) diluted 1:800 for 60 min at 37 °C. EnVision + HRP system (Labeled Polymer Anti-Mouse, K4001, Dako) and diaminobenzidine-hydrogen peroxidase-chromogen substrate system (Cytomation Liquid DAB + Substrate Chromogen System, K3468, Dako) were used with hematoxylin counterstaining. Slides were digitalised using the Pannoramic Scanner p250 Flash instrument (software version 1.11.25.0, 3DHISTECH Ltd., Budapest, Hungary), and analysed with digital microscope software (Pannoramic Viewer, v. 1.11.43.0, 3DHISTECH Ltd., Budapest, Hungary). The semiquantitative Quick-score (Q) method was applied for SFRP1 protein level alteration analysis. Every TMA core was scored by multiplying the percentage of positive cells by the given intensity value (0 for no staining, +1 for weak, +2 for moderate, and +3 for strong diffuse immunostaining).
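The Quick-score calculation described above amounts to a single multiplication, sketched here with input checks. The function name and the example core are hypothetical; the resulting score ranges from 0 to 300.

```python
def quick_score(percent_positive, intensity):
    """Quick-score of one TMA core: % positive cells times intensity (0 to 3)."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_positive * intensity


# Hypothetical core: 60 % of cells positive with moderate (+2) staining.
q = quick_score(60, 2)
```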
Discussion
The goal of this study was to identify DNA methylation and miRNA markers associated with the sequence of adenoma-carcinoma formation leading to CRC. The candidate markers were selected based on whole genome gene expression array data, DNA methylation analysis, and in silico prediction and validation of miRNA expression.
The study identified a set of 18 transcripts showing continuous gene expression alterations that correlated with CRC progression. Microarray experiments revealed 12 genes (BCL2, CDX1, CYP27B1, ENTPD5, MAL, PRIMA1, PTGDR, PTGS2, SFRP1, SOCS3, SULT1A1, and TIMP1) with significantly different transcriptional activities in AD compared to NAT controls, while 6 genes (ALDH1A3, COL1A2, FADS1, SFRP1, SULF1, and THBS2) showed unique gene expression alterations only in CRC samples. More specifically, looking at cellular components of the abovementioned stages of CRC formation, the results showed that epithelial cells in AD express decreased amounts of SOCS3 and PRIMA1, whereas those in CRC express less BCL2, CYP27B1, COL1A2, FADS1, and SULT1A1.
Demethylation treatment of colon adenocarcinoma cell lines led to varying degrees of upregulation of certain transcripts. In the HT-29 cell line, ALDH1A3 and SOCS3 were found to be upregulated by 0.5 μM 5-Aza. Interestingly, PTGS2 in HCT-116 cells and TIMP1 in the SW480 cell line also showed higher expression after 0.5 and 3 μM 5-Aza treatments.
From the resulting marker set,
COL1A2,
SFRP2, and
SOCS3 were hypermethylated and
THBS2 was hypomethylated in both AD and CRC samples compared to NAT. Based on the literature, hypermethylation of
COL1A2 was confirmed in head and neck cancer [
20], melanoma [
21], and bladder cancer [
22]. This suggests that
COL1A2 may contribute to the formation of various cancers by modulating cell proliferation and migration. In the gastrointestinal tract, expression of
COL1A2 may be associated with endothelial-to-mesenchymal transition [
23]. Collagen production of carcinoma cells decreases during oncogenic transformation [
24], and hypermethylation of
COL1A2 was confirmed in several CRC cell lines (HCT 116, SW480, and SW620) as well as in primary CRC tissues [
25].
SFRP2 belongs to the well-known inhibitors of the Wnt pathway, abnormal activation of which (e.g. via
APC mutation or beta-catenin translocation) is a frequent and early event in the genesis of CRC [
26]. It has already been shown to be hypermethylated in colorectal cancer cell lines (e.g. HCT116) as well as primary CRC [
27,
28]. Furthermore, it has recently been recognised as a promising and sensitive marker of stool-based screening of CRC [
26].
SOCS3 is a negative regulator of the JAK-STAT3 pathway; therefore, it may affect cell proliferation and the cell cycle [
29]. Mutational analysis of the gene revealed no marked association between
SOCS3 promoter region polymorphisms and the risk of developing metastatic colorectal cancer [
30]. Epigenetic inactivation of
SOCS3 was reported in human malignant melanomas and glioblastoma multiforme [
31,
32]. Reduced gene expression of
SOCS3 was found along the progression from ulcerative colitis (UC) through low-grade dysplasia to CRC. Related to this, DNA methylation of
SOCS3 could also be detected in colonic biopsies of UC-CRC patients but not from healthy controls or from inactive UC patients [
33,
34].
THBS2 hypermethylation might be responsible for altered expression of thrombospondin-2 protein in ovarian cancer and endometrial adenocarcinomas [
35]. Thrombospondin-2 is an antiangiogenic factor in CRC and its expression was associated with inhibition of angiogenesis and metastasis formation in CRC [
36].
The set of
BCL2, PRIMA1, and
PTGDR showed hypermethylation only in CRC.
BCL2 (B-cell CLL/lymphoma 2) is an apoptotic inhibitor. Its hypermethylation was documented in breast cancer [
37] and bladder cancer [
38]. Bcl-2 protein plays a role in CRC formation [
39] and has a reduced expression in CRCs with microsatellite instability [
40]. DNA hypermethylation of
BCL2 was detected in CRC cases; however, there was no relationship between gene expression and methylation of specific CpG sites [
41].
PRIMA1 encodes a membrane protein anchoring acetylcholinesterase to cell membranes [
42]. Its promoter hypermethylation was detected in major depressive disorder with a concomitant decrease in gene expression [
43]. It has not yet been associated with CRC development. Decreased mRNA expression levels of
PTGDR in colorectal AD and CRC caused by DNA methylation were previously described [
8].
In summary, MAL, PRIMA1, PTGDR and SFRP1 showed downregulated gene expression with a parallel increase in DNA methylation level that correlated with CRC development. Meanwhile, the downregulation of BCL2, CDX1, ENTPD5 and SULT1A1 was not accompanied by significant DNA methylation changes; thus, other regulatory processes should be further investigated to understand these changes in gene expression.
After DNA methylation analysis of candidate genes with altered gene expression, the potential influence of DNA methylation on the protein level was also investigated. Significantly decreasing protein levels of
SFRP1 could be observed along the adenoma-carcinoma sequence. This result is in accordance with the literature, as epigenetic regulation of
SFRP1 can lead to decreased protein levels [
44,
45].
On a limited sample set, miRNAs upregulated along the AD-CRC sequence were also identified. miR-21 was found to be remarkably upregulated in AD and CRC samples compared to NAT controls. On the basis of
in silico prediction, miR-21 can target genes showing no remarkable alteration in their promoter methylation (e.g.
BCL2, MAL, PTGS2) during CRC development, which might influence their gene expression levels. miR-21 is known to play a role in tumour formation and was also found to be upregulated in CRC tissues along tumour formation [
46,
47]. The expression level of miR-21 is elevated both in colorectal adenomas and cancers, and the degree of upregulation correlates with more advanced stages of CRC [
7]. This small non-coding RNA could have a fundamental role in the progression of CRC, as an elevated level of miR-21 was found to be predictive of poor survival [
48], and it may increase proliferation, migration and invasion. In CRC cell lines with the EMT phenotype, the expression of the miR-21 oncomiR is regulated by AP-1 and ETS transcription factors and also by epigenetic factors. Activating histone modifications (H3K4me3, H3K9/14ac, H3K27ac), but no inactivating ones, were detected on the miR-21 promoter region [
49]. These epigenetic mechanisms can affect the binding affinity of transcription factors to the miR-21 promoter regulating its expression level. Upregulated miR-181 in CRC cases might also influence gene expression level of the Bcl-2 family members [
50].
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
AK, BP, PH, and AVP performed the DNA methylation analysis; OG, SS, KT, and KL performed the microarray experiments; BW and AB conducted bioinformatic analyses; GV analysed immunohistochemistry results; NZS performed miRNA expression experiments; ZT, IK, and BM contributed to the design and critical review of the manuscript. All authors read and approved the final manuscript.