Background
As in most Western countries, colorectal cancer (CRC) is a major public health issue in France, where it is the second most common cause of death from cancer among adults [
1]. Unfortunately, CRC is often diagnosed at too late a stage, which results in a poor prognosis and emphasizes the need for prevention and early diagnostic tools [
2]. The development of such tools is, nonetheless, highly dependent on the form of cancer being screened. Thus, reliable genetic tests are already available to detect phenotypically well-characterized familial forms of CRC associated with high-penetrance alleles of genes, including APC in familial adenomatous polyposis (FAP), DNA mismatch repair genes in Lynch syndrome or hereditary non-polyposis colorectal cancer (HNPCC), and MUTYH in MUTYH-associated polyposis (MAP) [
3]. However, familial diseases together probably make up only a small percentage of all CRCs [
2], and the development of comparable screening tests appears more arduous and remote for sporadic cases, which account for the large majority of CRCs.
In contrast with their major role in familial CRCs, heritable factors are not the only components of the etiology of sporadic CRCs. Susceptibility to sporadic CRCs is multifactorial and derives from multiple interactive combinations of numerous low-penetrance alleles and relevant environmental or behavioral risk factors [
2,
4]. Independently, each low-penetrance allele contributes modestly to the increase in CRC risk, but its interactions with other susceptibility alleles and environmental factors can lead to a substantial increase in CRC risk, especially in individuals exposed to certain dietary and lifestyle habits [
4‐
6]. The number of interactions is all the higher because the susceptibility genes can be involved in many different biological pathways, which explains the extremely variable phenotype encountered in sporadic CRCs.
Despite much criticism for their non-reproducibility and weak statistical power [
6], genetic association studies have been widely used to decipher the mechanisms of cancer susceptibility. Moreover, their quality has improved greatly over the past few years [
7]. They recently started to produce very valuable results, as illustrated by the identification of several susceptibility loci for colorectal cancers [
8‐
12]. Such studies can now be expected to yield results with positive consequences for medical oncology. Beyond the search for susceptibility genes, a global effort is currently being made in the field of sporadic cancers to determine which combinations of genetic variants, i.e., which genetic backgrounds, present at non-rare frequencies in the general population, are likely to confer an increase in cancer risk, either alone or by interacting with common environmental factors [
4,
5,
13,
14].
In order to be part of this effort, we have conducted a case-control genetic association study based on a large French population sample. This sample has already proved very useful by contributing to the identification of the new CRC susceptibility locus at chromosome 8q24.21 by Zanke et al. [
9]. In the present study, however, our purpose was somewhat more modest, as we did not attempt to identify new susceptibility loci or variants by a genome-wide approach. Through an exploratory study, we tried to find out whether combinations of variants already known for their possible involvement in carcinogenesis (especially in various Caucasian populations) could determine "genotypic profiles" at risk for CRC among individuals of our French study population. Thus, we focused on 52 allelic variants of 35 candidate genes, selected through a review of the literature on CRC susceptibility, and drawn from five biological pathways relating to inflammation, xenobiotic detoxification, one-carbon metabolism, insulin signaling, and DNA repair. Here, we report the results of our investigation on the risk for CRC associated with all 52 allelic variants, analyzed singly or in combinations.
Results
The allele frequencies we found for each of the 52 polymorphisms tested were consistent with those reported in the literature and in dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/) for Caucasian populations. Two of the 52 polymorphisms analyzed turned out to be monomorphic. The 50 others were all in Hardy-Weinberg equilibrium. Results of the unconditional logistic regression analyses for CRC association with polymorphisms considered independently are described in Table
1. Alternative results of conditional logistic regression restricted to 811 age- and sex-matched pairs of individuals are reported in Additional file
3.
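As an illustration of the Hardy-Weinberg check mentioned above, the following is a minimal sketch of a one-degree-of-freedom chi-square test for a biallelic locus. It is not the authors' procedure, which is not detailed in this section; the function name `hwe_chi_square` and the genotype counts in the usage note are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium at a biallelic locus.

    n_aa, n_ab, n_bb are the observed counts of the three genotypes.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        # A monomorphic locus has no expected heterozygotes; the test is undefined.
        raise ValueError("monomorphic locus: HWE test not applicable")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `hwe_chi_square(250, 500, 250)` describes a locus whose hypothetical counts match the Hardy-Weinberg proportions exactly, whereas a marked heterozygote deficit such as `hwe_chi_square(400, 200, 400)` gives a very small p-value. Monomorphic polymorphisms, like the two reported here, cannot be tested this way.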
Six associations between CRC risk and allelic variants were determined by both unconditional and conditional logistic regression analyses. For SNPs
PTGS1 c.639C>A (p.Gly213Gly),
IL8 c.-352T>A, and
MTHFR c.1286A>C (p.Ala429Glu), the minor alleles appeared to be associated with an increase in CRC risk, whereas for SNPs
PLA2G2A c.435+230C>T and
PPARG c.1431C>T (p.His477His), they were associated with a decrease in CRC risk. Application of the false discovery rate to the results obtained by unconditional logistic regression analysis gave q-values of about 0.5 for these five findings, suggesting that the associations were not very robust. In contrast, q-values determined from conditional logistic regression analysis ranged between 0.07 and 0.26 for the same findings, suggesting more robust associations [Additional file
3]. Our analyses also showed a protective effect of the variant
GSTM1 null carried at the heterozygous state on CRC risk. However, this observation is not relevant at the biological level since the loss of activity of the GSTM1 enzyme, or GSTM1 deficiency, is associated with a deletion carried on both alleles. In fact, complementary analyses in our study population revealed that individuals carrying two deleted alleles did not have a significantly different risk of CRC compared to carriers of one or zero deleted alleles (OR 1.08, 95% CI 0.91–1.21, p = 0.4), which is consistent with two meta-analyses performed on this polymorphism [
19,
38]. Therefore, the effect on CRC risk observed for the
GSTM1 null allele should be considered with great caution, all the more so because the q-value of 0.4 calculated for this finding suggests a false positive result. Two additional CRC-predisposing effects were observed for SNPs
PLA2G2A c.-859C>G and
CYP1B1 c.1294C>G by conditional logistic regression analysis [Additional file
3], but were not found when the unconditional method was applied to the whole study population; they were therefore considered false positives.
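The q-values discussed above come from a false discovery rate procedure that is not specified in this section. As a hedged illustration only, the sketch below implements the Benjamini-Hochberg adjustment, one common way of obtaining FDR-adjusted p-values; whether the study used this procedure is an assumption, and the p-values in the usage note are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With the hypothetical input `[0.01, 0.04, 0.03, 0.20]`, the smallest p-value is multiplied by the number of tests (4), so an unadjusted 0.01 becomes 0.04. With 50 polymorphisms tested, as here, nominal p-values close to 0.05 can easily yield adjusted values near 0.5.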
Since five SNPs were clearly found to be associated with a modification of CRC risk in single-SNP analyses, we focused our further analyses on combinations of five allelic variants, as described in Methods. Among all the genotypic combinations tested, only one showed a statistically significant association with an increased risk of CRC, which appeared robust according to the calculation of FDR (q-value < 0.1). This combination was composed of the very same five SNPs that had been found independently associated with CRC risk in single-SNP analyses. Additional file
4 uses the example of this precise set of five SNPs to illustrate the different models of genotypic combinations tested for every set of five polymorphisms picked from the 52 polymorphisms of the study. The most predisposing patterns for this combination, presented in Table
2, combine genotypes 1 or 2 for
PTGS1 c.639C>A,
IL8 c.-352T>A, and
MTHFR c.1286A>C, and 0 for
PLA2G2A c.435+230C>T and
PPARG c.1431C>T. These predisposing patterns appeared to be associated with a highly significant increase in colorectal cancer risk, compared either to the reference pattern alone (OR 2.65; 95% CI 1.58–4.42, p = 0.0005) or to the reference and mixed patterns combined (OR 1.97; 95% CI 1.31–2.97, p = 0.0009).
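The pattern definitions described above can be sketched as a small classification function. This is an illustrative reading of the text and of Table 2, not the authors' code: the function name `classify_pattern` is hypothetical, and the genotype coding (0 for no copy of the minor allele, 1 or 2 for one or two copies) is assumed.

```python
# SNPs whose minor allele was associated with increased CRC risk.
RISK_SNPS = ("PTGS1 c.639C>A", "IL8 c.-352T>A", "MTHFR c.1286A>C")
# SNPs whose minor allele was associated with decreased CRC risk.
PROTECTIVE_SNPS = ("PLA2G2A c.435+230C>T", "PPARG c.1431C>T")


def classify_pattern(genotypes):
    """Classify a five-SNP genotype combination.

    genotypes maps each SNP name to 0, 1 or 2 (assumed coding: number of
    copies of the minor allele).
    """
    risk_carried = [genotypes[snp] > 0 for snp in RISK_SNPS]
    protective_carried = [genotypes[snp] > 0 for snp in PROTECTIVE_SNPS]
    if not any(risk_carried) and not any(protective_carried):
        return "reference"      # genotype 0 at all five SNPs
    if all(risk_carried) and not any(protective_carried):
        return "predisposing"   # 1 or 2 at the risk SNPs, 0 at the protective SNPs
    if not any(risk_carried) and all(protective_carried):
        return "protective"     # 0 at the risk SNPs, 1 or 2 at the protective SNPs
    return "mixed"              # any other combination
```

Under this reading, an individual heterozygous at the three risk SNPs and carrying genotype 0 at both protective SNPs falls into the predisposing group, whereas carrying a single risk allele alone yields a mixed pattern.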
Table 2
Analysis of association between combinations of genotypes and risk for colorectal cancer, in the entire study population.
0 | 0 | 0 | 0 | 0 | Null or very weak (reference pattern) |
0 | 1 or 2 | 1 or 2 | 0 | 0 | Protective (protective patterns) * |
1 or 2 | 0 | 0 | 1 or 2 | 1 or 2 | Predisposing (predisposing patterns) |
Other genotypes | Average (mixed patterns) |
B. Analysis of observed combination of genotypes association with colorectal cancer (n = 2144, adjusted by sex and age) |
Patterns of genotypic combinations | Controls | Patients | OR (95% CI) | P-value** | |
Reference pattern | 95 (8.5%) | 63 (6.2%) | 1.00 | | |
Protective patterns | 7 (0.6%) | 4 (0.4%) | 0.86 (0.24–3.06) | 0.8180 | |
Mixed patterns | 978 (87.2%) | 890 (87.0%) | 1.37 (0.98–1.90) | 0.0602 | |
Predisposing patterns | 41 (3.7%) | 66 (6.4%) | 2.65 (1.58–4.42) | 0.0005 | |
Reference and mixed patterns | 1080 (96.3%) | 957 (93.5%) | 1.00 | | |
Predisposing patterns | 41 (3.7%) | 66 (6.5%) | 1.97 (1.31–2.97) | 0.0009 | |
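For orientation, a crude (unadjusted) odds ratio with a Wald 95% confidence interval can be computed directly from the counts in Table 2. The sketch below is only an approximation of the reported analysis: the published ORs come from logistic regression adjusted for sex and age, so crude values differ somewhat. The function name `odds_ratio_wald` is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_wald(cases_exposed, controls_exposed, cases_ref, controls_ref, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    Returns (odds_ratio, lower_bound, upper_bound).
    """
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(1 / cases_exposed + 1 / controls_exposed
              + 1 / cases_ref + 1 / controls_ref)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Predisposing patterns versus the reference pattern (counts from Table 2).
or_ref, lo_ref, hi_ref = odds_ratio_wald(66, 41, 63, 95)
# Predisposing patterns versus reference and mixed patterns combined.
or_all, lo_all, hi_all = odds_ratio_wald(66, 41, 957, 1080)
```

The crude estimates are about 2.4 and 1.8, respectively, slightly below the adjusted values of 2.65 and 1.97 reported in the table.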
Validation of the model by bootstrapping and examination of internal consistency strengthened the above results. By bootstrapping, we calculated a mean ± SD OR of 1.86 ± 0.41 (95%CI 1.23–2.85, p = 0.003). Examination of internal consistency confirmed a trend toward an increased CRC risk associated with the predisposing combination patterns, regardless of the mode of stratification used [Additional file
5]. No effect of gender or anatomical sub-location was noted. Most of the associations observed were statistically significant, but certain stratifications determined according to geographical origin and/or age displayed a greater effect regarding the predisposing combination patterns (Table
3). Among the 2144 individuals composing the whole study population, the strongest association was found in individuals of ≤ 67 years of age and originating from the French
département of the Vendée.
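The bootstrap validation reported above can be illustrated with a minimal sketch that resamples patients and controls with replacement and recomputes a crude odds ratio each time. The authors' exact procedure (number of replicates, adjustment for sex and age) is not given in this section, so the function name `bootstrap_or`, the number of replicates, the seed, and the use of a crude OR are all assumptions.

```python
import random
from statistics import mean, stdev


def bootstrap_or(cases_exposed, cases_total, controls_exposed, controls_total,
                 n_boot=1000, seed=0):
    """Bootstrap the crude odds ratio of a 2x2 table.

    Cases and controls are resampled separately, with replacement.
    Returns (mean_or, sd_or, (lower, upper)) with a 95% percentile interval.
    """
    rng = random.Random(seed)
    cases = [1] * cases_exposed + [0] * (cases_total - cases_exposed)
    controls = [1] * controls_exposed + [0] * (controls_total - controls_exposed)
    ors = []
    for _ in range(n_boot):
        a = sum(rng.choices(cases, k=cases_total))        # exposed cases
        c = sum(rng.choices(controls, k=controls_total))  # exposed controls
        b = cases_total - a                               # unexposed cases
        d = controls_total - c                            # unexposed controls
        if 0 in (a, b, c, d):
            continue  # skip replicates where the OR is undefined
        ors.append((a * d) / (b * c))
    ors.sort()
    lower = ors[int(0.025 * len(ors))]
    upper = ors[int(0.975 * len(ors)) - 1]
    return mean(ors), stdev(ors), (lower, upper)


# Counts from Table 2: 66 of 1023 patients and 41 of 1121 controls
# carry a predisposing pattern.
mean_or, sd_or, (ci_low, ci_high) = bootstrap_or(66, 1023, 41, 1121)
```

Because this sketch uses crude rather than adjusted odds ratios, its output is not expected to reproduce the reported figures exactly.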
Table 3
Replications of genotypic combinations analyses on stratifications of the patients and controls groups determined according to geographical origin and/or age.
Reference pattern | 14 | 29 | 1.00 | | 65 | 31 | 1.00 | | 12 | 16 | 1.00 | |
Mixed patterns | 171 | 359 | 0.92 (0.47–1.83) | | 676 | 460 | 1.42 (0.91–2.21) | | 120 | 195 | 1.15 (0.52–2.56) | |
Predisposing patterns | 4 | 32 | 3.80 (1.10–13.19) | 0.011 | 27 | 38 | 3.00 (1.56–5.77) | 0.003 | 2 | 21 | 7.22 (1.38–37.80) | 0.009 |
Reference and mixed patterns | 185 | 388 | 1.00 | | 741 | 491 | 1.00 | | 132 | 211 | 1.00 | |
Predisposing patterns | 4 | 32 | 4.14 (1.42–12.08) | 0.002 | 27 | 38 | 2.17 (1.31–3.61) | 0.003 | 2 | 21 | 6.36 (1.44–28.09) | 0.002 |
Discussion
In this study, we have used a candidate gene approach to examine the associations between colorectal cancer risk and 52 allelic variants distributed in 35 genes drawn from pathways of inflammation, xenobiotic detoxification, one-carbon metabolism, insulin signaling, and DNA repair. To our knowledge, this is one of the most comprehensive investigations of this kind, covering more than 1000 patients with sporadic colorectal cancer and 1000 controls, i.e., within the range of 500–2000 case-control pairs defined by Brennan as needed to detect statistically significant effects of polymorphisms [
39]. Obviously, the list of polymorphisms assembled here is not exhaustive, and it must be seen as a test panel, a first attempt in the search for CRC-predisposing "genetic profiles".
With the exception of the 4 SNPs in
PLA2G2A that we had selected by dHPLC, all the allelic variants chosen for the present study had previously been found to be associated with a modification of CRC risk in at least one study. However, we have been able to replicate the association with CRC risk for only 5 of the 52 polymorphisms tested (Table
1). By independent single-SNP analyses, three SNPs were shown to increase CRC risk:
PTGS1 c.639C>A,
IL8 c.-352T>A, and
MTHFR c.1286A>C. Two other SNPs,
PLA2G2A c.435+230C>T and
PPARG c.1431C>T, were on the contrary associated with a decrease in CRC risk. Combinations of the CRC-predisposing alleles relating to these five variants determine "genotypic profiles" at significantly higher risk of CRC (OR 1.97, 95% CI 1.31–2.97). Among the individuals exhibiting these profiles, younger individuals (≤ 67 years) from the Vendée seem to be even more predisposed to CRC (OR 6.36, 95% CI 1.44–28.09). However, the size of the population sample analyzed here is too small to draw definitive conclusions on any age or geographical effect. Similarly, the consistency of the risk profile was lost in some of the study population subgroups used for the analyses reported in Additional file
5.b, most likely because of their relatively small size. Caution is therefore required when considering the true significance of this risk profile, even though bootstrapping results tend to show its robustness [Additional file
5.a]. It is noteworthy that a sixth polymorphism, GSTM1 null allele, showed a protective effect in independent analyses of allelic variants, but the significance of its association with CRC risk remained dubious, and was not confirmed by multiple-SNP analyses, contrary to the five other polymorphisms.
To investigate further the CRC-predisposing combinations that emerged from our analyses, we tested their possible interactions with the environmental and lifestyle risk factors reported in our study questionnaire (physical activity, cooking methods, and consumption of alcohol, tobacco, red meat, cold cuts, white meat and poultry, bread, dairy products, fish, fruits, pastries, or vegetables), by use of SNPStats [
40]. Indeed, in the same study population, we had previously observed an interaction between allelic combinations of CYP genes and consumption of red meat which leads to a strong increase in CRC risk [
15]. In the present study, we did not observe any comparable gene-environment interaction that could remain statistically significant throughout multiple adjustment and/or test for internal consistency in patient and control sub-groups (data not shown). Given that four of the SNPs composing the CRC-predisposing combinations are related to inflammation, the most relevant "environmental" factor to be tested here would have actually been NSAIDs treatment. But, since this item did not figure in our questionnaire on life-habits [
15] (we assumed that it would have introduced a bias in the design of the control group), we were unfortunately unable to analyze its interaction with the CRC-predisposing genotypic combinations of the five SNPs mentioned above.
The respective biological impacts of the five variants
PTGS1 c.639C>A,
IL8 c.-352T>A,
MTHFR c.1286A>C,
PLA2G2A c.435+230C>T, and
PPARG c.1431C>T provide some clues to the predisposition to CRC associated with some of the genetic combinations that they compose. Thus,
PLA2G2A c.435+230C>T and
PTGS1 c.639C>A belong to genes which encode proteins that act consecutively in the enzymatic cascade of the arachidonic acid pathway. PLA2G2A catalyzes the hydrolysis of membrane phospholipids, thereby releasing unsaturated fatty acids, including arachidonic acid [
41]. The latter becomes the substrate of PTGS1, which in turn catalyzes the formation of prostaglandin H2 (PGH2), a precursor for a number of inflammatory molecules – eicosanoids – that promote colorectal carcinogenesis [
42]. Until now, out of the potential key players in the CRC process, more attention has been given to another element of the arachidonic acid pathway,
PTGS2, rather than to
PTGS1 or
PLA2G2A [
31,
43]. The interest for
PTGS2 in CRC risk stems from its expression induced by pathophysiological conditions such as tumorigenesis or inflammatory situations [
16]. However, we found no significant modification of CRC risk associated with any of the four
PTGS2 SNPs we tested (Table
1). In contrast,
PTGS1 is constitutively expressed and its implication in CRC risk has been little investigated, since its carcinogenic role through induction of
PTGS2 has been suggested only recently [
16,
34]. Thus, to our knowledge, the synonymous polymorphism c.639C>A (p.Gly213Gly), which we found to have a predisposing effect on CRC (OR 1.24) at the heterozygous (CA) or homozygous (AA) state, had not been examined in this context to date, and its precise impact on PTGS1 activity would require further functional studies. Similarly, the four
PLA2G2A SNPs we studied have not yet been tested within the CRC risk context. Therefore, there is no possible point of comparison for the protective effect relating to CRC that we found for the allele c.435+230T carried at homozygous state (OR 0.80, 95%CI 0.66–0.98). Because of its localization within the Mom-1 (Min modifier) locus, and its promotion of tumors in APC
Min mice, it had first been suggested that
PLA2G2A may represent a tumor suppressor gene involved in familial forms of CRC [
44]. Such a role of
PLA2G2A in FAP genesis was subsequently ruled out in humans [
41,
45]. However, the contribution of
PLA2G2A to sporadic CRC predisposition cannot be excluded, and it has been suggested that the increase in
PLA2G2A expression could cause the accumulation of arachidonic acid, a molecule likely to have pro-apoptotic properties [
41]. A hypothesis for the protective role of variant c.435+230T might therefore be its enhancement of
PLA2G2A expression.
PPARG c.1431C>T (aka C161T) and
IL8 c.-352T>A also relate to inflammation or immune response. PPARγ (peroxisome proliferator-activated receptor gamma) is a nuclear receptor tightly linked to the arachidonic acid pathway, in that it is activated by various eicosanoids [
46]. Essential to adipocyte differentiation and to regulation of lipid metabolism, PPARγ is thought to have overall tumor suppressive properties, exerted notably in CRC [
46‐
48]. A protective effect on CRC risk had been reported for the rare allele of variant
PPARG c.36C>G (p.Pro12Ala) [
49,
50], but we were not able to reproduce this observation in our study population, perhaps because the reference studies were designed on smaller populations (about 200 case-control pairs). On the other hand, we noted a significantly decreased risk of CRC associated with the rare allele of variant c.1431C>T (p.His477His) at the homozygous state (OR 0.30, 95%CI 0.14–0.65). These results are consistent with those reported in an earlier study on colorectal adenoma [
48], but they are inconsistent with the contrary observations made by another team investigating CRC [
33]. Given the controversial effect assigned to
PPARG c.1431C>T, functional assays would be required to understand the exact role of this little-investigated polymorphism.
As regards
IL8, it encodes a pro-inflammatory chemokine, released by infiltrating lymphocytes, in response to exposure of the colonic epithelium to toxic and pathologic challenges [
50]. The allelic variant
IL8 c.-352T>A has been found to be associated with CRC risk, even though the effect attributed to the rare allele A ranges from null [
51], to predisposing or even protective [
17,
52]. In our study, we found that genotypes c.-352AA or c.-352TA were associated with an elevated risk of CRC compared to genotype c.-352TT (OR 1.21, 95%CI 1.01–1.46). The debate on the true carcinogenic role of variant c.-352T>A probably comes from its controversial impact on
IL8 transcription, sometimes described as an enhancement [
51], sometimes as a downregulation [
52]. As
IL8 contributes to chronic inflammation, it would make sense that its overexpression would increase the risk for CRC, and therefore the allele c.-352A may rather enhance the transcriptional activity of the gene.
The fifth gene highlighted by our analyses,
MTHFR (5,10-methylenetetrahydrofolate reductase), is part of the one-carbon metabolic pathway, involved in both DNA methylation and DNA synthesis. The MTHFR enzyme synthesizes 5-methyltetrahydrofolate, the primary circulatory form of folate, which represents a fundamental methyl donor in cellular metabolism [
26]. The influence of MTHFR activity on folate status could be important in CRC neoplasia, since folate deficiency could cause DNA hypomethylation, and/or induce uracil misincorporation during DNA synthesis, which would lead to mutations and chromosomal damage [
26,
27,
53]. Two polymorphisms, c.665C>T (also reported as C677T) and c.1286A>C (aka c.1298A>C), have been widely tested for modification of CRC risk, because of the reduction in MTHFR activity induced by their minor allele. In the present study, we found no effect for c.665C>T, whereas we observed an elevated risk of CRC associated with genotypes c.1286CC or c.1286AC compared to genotype c.1286AA (OR 1.21, 95%CI 1.02–1.44). At first sight, these results appear difficult to compare with the highly inconsistent or even conflicting earlier results reviewed in two recent comprehensive meta-analyses [
54,
55]. However, this discrepancy in the overall results could be explained by variation in the polymorphisms' effect according to folate status. Indeed, Mu et al. suggested that genotypes reducing MTHFR activity (here c.1286CC or c.1286AC) would favor cancer risk when dietary folate levels are low, by inducing DNA hypomethylation that causes DNA damage and mutations [
56]. When folate levels are adequate, the reduced MTHFR activity induced by the same genotypes would lead to a larger pool of methylenetetrahydrofolate available for DNA synthesis, and would therefore prevent cancer by diminishing uracil misincorporation. Moreover, it has been suggested that folate would prevent tumor development when administered before preneoplastic lesions appear, but promote it when administered afterwards [
53].
All these biological data considered, our work underlines the important contribution of inflammatory processes to CRC susceptibility in our study population, and it points to a possible crosstalk between inflammation and one-carbon pathways. The four allelic variants of genes
PTGS1,
PLA2G2A,
PPARG, and
IL8 might favor inflammatory processes, whereas the
MTHFR allelic variant could induce a DNA hypomethylation altering the expression of the putative tumor suppressor gene
PPARG. Two recent studies on diabetes and cardiovascular diseases have suggested a common pathobiological mechanism between the inflammation process and genotype TT of
MTHFR c.665C>T (aka C677T), i.e., a genotype inducing a lower MTHFR activity like genotype
MTHFR c.1286CC [
57,
58]. However, our results do not allow us to conclude that the five allelic variants interact. Indeed, the odds ratios associated with the observed CRC-predisposing combinations are certainly not strong enough, and our method is not the most appropriate one for investigating gene-gene interactions.
Authors' contributions
SK, BB, and SRdP contributed equally to this work. SK participated in the design of the study, carried out statistical analyses, contributed to genotyping assays, and drafted the manuscript; BB conceived the study, participated in its design, and coordinated the recruitment of patients and controls; SRdP participated in statistical analyses and genotyping assays, and helped to draft the manuscript; CS managed the samples and contributed to genotyping assays; HC contributed to molecular genetic analyses; TLN coordinated data storage and management; CLH coordinated the recruitment of patients and controls; RF and JO contributed to the collection of patient samples; BL and LDC contributed to the collection of control samples; VS participated in statistical analyses; SB conceived the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.