Background
All common cancers show some degree of familial clustering [
1]. Most of the familial aggregation, especially in breast cancer [
2], results predominantly from inherited susceptibility [
3]. Linkage studies in the 1990s led to the discovery of several predisposition genes associated with many rare familial cancer syndromes, thus providing fundamental insights into various pathways of carcinogenesis [
4]. Nevertheless, this approach has mainly been limited to genes with relatively rare, highly penetrant alleles, for several reasons, such as a lack of power to detect alleles conferring modest or moderate risks that are believed to be involved in common cancers [
1,
5‐
7]. Analyses of risk attributable to such alleles in the known breast cancer susceptibility genes (e.g.
BRCA1, BRCA2, TP53, PTEN, ATM) suggest they are responsible for ~25% of the familial component of breast cancer risk [
6,
8,
9]. The number and properties of genetic variants that account for the remaining 75% of inherited risk are largely unknown. It has been proposed that a complex polygenic model is the best explanation for this missing genetic risk [
10,
11] and perhaps the majority of breast cancers arise in a susceptible minority of women [
2,
12].
Under the Common Variant/Common Disease (CV/CD) model, disease susceptibility is suggested to result from the joint action of several common variants, with unrelated affected individuals sharing a substantial proportion of disease alleles [
13‐
15]. The alternative is the heterogeneity hypothesis, which maintains that genetic susceptibility to common disease is caused by many different rare genetic variants, with a relatively large effect produced by each allele [
16‐
19]. If most cancer susceptibility is related to fundamental processes of cellular control, rare alleles might turn out to be the more important component and should be detectable by linkage analysis and/or the candidate gene re-sequencing approach [
5,
6].
The central role of
BRCA1 and
BRCA2 genes in DNA repair, recombination, cell cycle control and transcription [
20,
21] has led to the investigation of the implication of several similarly acting genes in breast and/or ovarian cancer predisposition, including
ATM (Ataxia telangiectasia-mutated) [
22‐
27],
CHEK2 [
28,
29],
TP53 [
30],
PTEN [
31],
STK11 [
32] and a few other genes involved in DNA repair [
33]. Ataxia-telangiectasia-mutated and Rad3-related (ATR) is a member of the phosphatidyl inositol-kinase (PIK)-related family which plays, along with ATM, a central role in cell-cycle regulation, by transducing DNA damage signals to downstream effectors of cell-cycle progression [
34]. In response to double-strand breakage, stalled replication forks or DNA adducts, ATR complexed with ATR-interacting protein (ATRIP) is recruited and then phosphorylates a number of proteins involved in DNA damage, including H2AX, 53BP, TP53, NBS1 and CHEK1 [
35‐
38], thereby activating cell checkpoints, DNA repair or apoptosis. ATR is also able to bind to Rad17 and BRCA1 and to associate with components of the nucleosome remodeling and deacetylating complex [
39‐
41]. Furthermore, ATR has recently been shown to interact with the Fanconi Anemia complex [
42], which a growing body of evidence links to the two
BRCA genes [[
21], for review see [
43]]. A recent study has also demonstrated that the Mre11/Rad50/NBS1 (MRN) complex, a central component in the cellular response to ionizing radiation and other causes of double-strand breaks, is required for ATR-dependent phosphorylation of the protein Smc1 (Structural maintenance of chromosomes 1) [
44]. ATR knockout studies showed that ATR is essential for somatic cell growth and genomic integrity in the embryo and that its deletion leads to genomic disruption and early embryonic lethality in mice [
45,
46]. Moreover, it has been reported that disruption of the
ATR gene leads to an increase in the incidence of large benign tumors in heterozygotes, possibly indicating that deficiency in ATR affects the rate of tumor initiation [
45].
Based on the major role of ATR in cellular response to DNA damage and its multiple interactions with several proteins such as BRCA1 [
40,
47],
ATR represents an attractive candidate gene to potentially explain a fraction of the remaining breast cancer susceptibility. The current study was designed to assess the possible involvement of
ATR germline mutations in breast cancer susceptibility. For this purpose, the complete sequence of the 47 exons and flanking intronic sequences of the
ATR gene was analyzed in DNA samples from individuals affected with breast cancer from unrelated
BRCA1- and
BRCA2-negative high-risk French Canadian breast/ovarian families.
Methods
Ascertainment of families and DNA extraction
The recruitment of high-risk French Canadian breast and/or ovarian families started in 1996 through a research project, which thereafter evolved into a large ongoing interdisciplinary research program designated INHERIT BRCAs. More details regarding ascertainment criteria, experimental and clinical procedures as well as the INHERIT BRCAs research program have been described elsewhere [
48‐
52]. A major component was to identify and characterize
BRCA1 and
BRCA2 mutations in French Canadian high-risk families (CGL cohort) [
52].
Subsequently, another component was designed for the "Localization and identification of new breast cancer susceptibility loci/genes". Ethics approval for this latter study was also obtained from the different institutions participating in this research project, and each participant, aware of their inconclusive BRCA1/2 test result, had to sign a specific informed consent form for participation in this component. A subset of 54 high-risk French Canadian breast/ovarian cancer families was recruited for the present study according to the following ascertainment criteria: 1) three or more breast cancer cases diagnosed before the age of 65 (48 families); 2) two or more breast cancer cases (<65) if one breast cancer was diagnosed before 45 years (5 families); or 3) a strong family history of breast/ovarian cancer (e.g. daughter-mother-grandmother) (1 family). All participants had to be at least 18 years of age and mentally capable. The diagnoses of breast and/or ovarian cancer were confirmed by obtaining a pathology report, and when two or more subjects were available within a family, the youngest subject was systematically chosen for this study. The mean age at diagnosis of these 54 subjects affected with breast cancer was 45.5 years (range 30–59 years); 46 of them were diagnosed before 50 years of age and 11 were affected by more than one breast cancer. The analysis of the breast cancer history revealed that 15 (28%), 18 (33%) and 19 (35%) families included 1–2, 3 or ≥ 4 case(s) in at most 2nd degree relatives, respectively. When including all breast cancer cases in the family history occurring in at most 3rd degree relatives from the index case, 10 (18%), 16 (30%) and 28 (52%) families had 1–2, 3 or ≥ 4 case(s), respectively.
The
BRCA1/2 status of each participant was previously assessed [
52]. Briefly, genomic DNA samples were first tested for a panel of 29 mutations, including 26 truncating mutations and 3 unclassified variants (two missense mutations and one in-frame deletion), observed and/or reported in the French Canadian population [
52]. Thereafter, DNA samples of individuals included in this study were sent to Myriad Genetic Laboratories (Salt Lake City, Utah, USA) for full-length
BRCA1/2 sequencing following their
Comprehensive BRACAnalysis
®
-BRCA1 and BRCA2 gene sequence analysis for susceptibility to breast and ovarian cancer test, with the exception of 9 subjects for which DNA samples from another affected individual of the family (n = 7) or unaffected parents of cancer cases (n = 2) were sent to Myriad as previously described [
49,
50,
52]. The presence of genomic rearrangements in
BRCA1/2 genes was thereafter investigated by Multiplex Ligation-dependent Probe Amplification (MLPA) for 45 of the 54 subjects and
BRCA1/2 Southern analysis for 32 of the 54 individuals. For seven of the remaining subjects, MLPA was performed on another individual of the family [
53], while for two subjects this analysis was not performed.
Genomic DNA from 73 healthy unrelated French Canadian women was obtained from Dr Damian Labuda at the Centre de cancérologie Charles Bruneau, Hôpital Ste-Justine, Montreal, Canada. The individuals who provided these samples were recruited on a non-nominative basis, in the framework of long-term studies aimed at characterizing genetic variability in human populations, approved by the Institutional Ethics Review Board. DNA from peripheral blood was isolated by conventional methods, using either phenol-chloroform extraction or Gentra kits (Minneapolis, MN, USA). The mean age of these individuals was 45.2 years; 2 (2.7%), 26 (35.6%), 23 (31.5%), 17 (23.3%) and 5 (6.8%) of them were 25–29, 30–39, 40–49, 50–59 and 60–69 years of age, respectively.
The validation group comprised 46 BRCA1/2-negative breast cancer proband cases of French origin belonging to multiple-case breast cancer families from the following sources: high-risk breast cancer only and breast/ovarian cancer families referred for genetic testing at the Department of Preventive Medicine at Creighton University School of Medicine, Omaha, NE, and at the cancer genetic counseling unit at Centre Léon Bérard, Lyon, France, and a population-based study including women diagnosed with breast cancer below age 46 years, recruited through the Rhône region cancer registry, France. The cancer status of index cases was confirmed through pathology reports. Cancers reported in relatives were verified through pathology reports, hospital records and death certificates. Index cases have been screened for mutations in
BRCA1 and
BRCA2 [
54,
55]. All subjects provided written informed consent for participation in the study. Approval for the study was obtained from the International Agency for Research on Cancer (IARC) ethics committee.
The mean age at diagnosis of these 46 French subjects affected with breast cancer was 39.7 years (range 19–61 years); 44 of them were diagnosed before 50 years of age and five were affected by more than one cancer. The analysis of the breast cancer history revealed that 23 (50%), 9 (20%) and 14 (30%) families included 1–2, 3 or ≥ 4 case(s) in at most 2nd degree relatives, respectively. When including all breast cancer cases in the family history occurring in at most 3rd degree relatives from the index case, 12 (26%), 14 (31%) and 20 (43%) families had 1–2, 3 or ≥ 4 case(s), respectively.
PCR amplification, mutation analysis and variant characterization
The intron-exon boundaries of the
ATR gene were determined by aligning GenBank mRNA records (NM_001184) with genomic sequence records (NC_000003).
ATR spans approximately 130 kb and is composed of 47 exons (3q22-q24: 143650778-143780349). PCR amplicons using primers designed by the Primer Express 2.0 software (Applied Biosystems, Foster City, CA, USA) covered the entire mRNA encoding portions and flanking intronic sequences from genomic DNA. Forty primer pairs were used to amplify fragments ranging in size from 351 bp to 1385 bp, which were sequenced with primers also indicated in the table [see
Additional file 1].
ATR direct sequencing was performed on an ABI3731 automated sequencer using version 3.1 of the Big Dye fluorescent method according to the manufacturer's instructions (Applied Biosystems, Foster City, USA). Sequence data were analyzed using the Staden preGap4 and Gap4 programs.
LD analysis, haplotype estimation and tagging SNP selection (tSNP)
To estimate the pattern of linkage disequilibrium (LD), all 41 SNPs identified in our breast cancer case series were genotyped. The LDA program [
56] was used to calculate pairwise LD for each SNP pair. Lewontin's |D'| was used as a measure of LD between SNPs [
57,
58].
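As an illustration of the LD measure named above, the following is a minimal sketch of how Lewontin's |D'| can be computed for one pair of biallelic SNPs from haplotype and allele frequencies. It is not the LDA program's implementation; the function name and its parameters are assumptions introduced here for illustration only.

```python
def lewontin_d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return |D'| for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    (Hypothetical helper, not part of any published package.)
    """
    # Raw disequilibrium coefficient D
    d = p_ab - p_a * p_b
    # D is normalized by its maximum attainable magnitude, which depends on its sign
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic; LD is undefined, report 0
    return abs(d) / d_max
```

A value of 1 indicates that at least one of the four possible haplotypes is absent (complete LD), whereas a value near 0 indicates that the two SNPs are close to independent.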
Haplotype analysis was performed using PHASE 2.1.1 software [
59,
60]. This program (PHASE) estimates haplotype frequencies with a Bayesian-based algorithm and then uses a permutation test to determine the significance of differences in inferred haplotypes between cases and controls. All association tests were run under default conditions, with 1000 permutations. Haplotype frequencies were estimated using the SNPs with minor allele frequency (MAF) ≥ 5% identified in both sample series (cases and controls). Haplotype blocks were identified using genotyping data from control individuals as well as using HapMap data from the CEPH cohort [
61] using the Haploview [
62,
63] software. Tagging SNPs (tSNPs) from each LD block were then identified using the same software. Splice site prediction scores were evaluated using SSPNN [
64] while protein alignment was performed using ClustalW [
65].
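The MAF ≥ 5% selection step described above can be sketched as follows. This is an illustrative sketch only: it assumes the threshold must be met in both cases and controls, and the SNP names and genotype counts in the usage lines are placeholders, not data from this study.

```python
def minor_allele_frequency(n_ref_hom: int, n_het: int, n_alt_hom: int) -> float:
    """Minor allele frequency from genotype counts at one biallelic SNP."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    alt_freq = (n_het + 2 * n_alt_hom) / total_alleles
    # The minor allele is whichever of the two alleles is rarer
    return min(alt_freq, 1 - alt_freq)


def common_in_both(case_counts: dict, control_counts: dict, threshold: float = 0.05) -> list:
    """Return SNPs whose MAF reaches the threshold in both sample series."""
    return [
        snp
        for snp, counts in case_counts.items()
        if snp in control_counts
        and minor_allele_frequency(*counts) >= threshold
        and minor_allele_frequency(*control_counts[snp]) >= threshold
    ]


# Placeholder genotype counts (reference homozygotes, heterozygotes, variant homozygotes)
cases = {"snp_a": (40, 12, 2), "snp_b": (51, 1, 2)}
controls = {"snp_a": (55, 16, 2), "snp_b": (70, 3, 0)}
selected = common_in_both(cases, controls)
```

With these placeholder counts only `snp_a` is retained, because `snp_b` falls below 5% in both series.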
RNA isolation from cell lines and normal tissue samples
Total RNA was extracted using TRI Reagent
® (Molecular Research Center inc, Cincinnati, OH, USA) according to the manufacturer's instructions as previously described [
66] from 1) EBV-transformed B-lymphoblastoid cell lines from the 54 cases used for our mutation screening; 2) nine cancer cell lines obtained from the American Type Culture Collection (ATCC), including two estrogen receptor (ER)-negative breast cancer cell lines (BT-20 and MDA-MB-231), four ER+ breast cancer cell lines (BT-474, CAMA-1, MCF7 and ZR75) and three prostate cancer cell lines (22RV1, LNCaP, PC3); and 3) the HaCat human skin keratinocyte cell line, which was generously supplied by Dr. N.E. Fusenig (German Cancer Research Center, Heidelberg, Germany) [
67,
68]. Total RNA samples from normal tissues were purchased directly either from Stratagene (breast and ovary) (La Jolla, CA, USA), BioChain Institute Inc. (leukocyte) (Hayward, CA, USA), or Clontech (all other normal tissue samples) (Palo Alto, CA, USA). RNA samples were then processed as previously described [
66]. Thereafter, reverse transcription of 2.5 μg of standardized RNA samples was performed using 250 ng random hexamers and 200 U of SuperScript™ II RNase H
- Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) following the supplier's protocol.
Characterization of the Δ33 and insΔInt41 alternative splice transcripts
Δ33 alternative splice transcript
In order to investigate whether the SNP c.5739-4del9+T may lead to alternative splice transcript(s), a PCR reaction was performed, using the forward primer (5'- GCAGATGGAAAATCTACAACATGGA) and reverse primer (5'- TGATTTCCATATTGTAGAGATCTGCCA) designed to allow amplification of a specific ATR cDNA fragment spanning nucleotides 5479 to 6348 of the wild-type mRNA, with cDNA samples from immortalized cell lines from two homozygous and one heterozygous individuals for this variant, as well as two wild-type individuals. PCR product lengths were analyzed by migration on 1.5% agarose gel and sequenced in both orientations. Thereafter, amplified ATR cDNA fragments were subcloned in the pCRII vector (TA cloning kit from Invitrogen) according to the manufacturer's instructions. After growing colonies and extracting the plasmid DNA samples using the GFX Micro Plasmid Prep Kit (Amersham), sequencing of each amplicon was performed as described above using the forward amplification primer. The only alternative splice transcript observed, designated Δ33, yielded a 749 bp PCR product, while the wild-type fragment length was 870 bp.
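The two product sizes quoted above follow directly from the amplified coordinates and the 121-nucleotide deletion within exon 33 reported for the Δ33 transcript. The short check below is only a sketch of that arithmetic; the constant and function names are chosen here for illustration.

```python
def amplicon_length(start: int, end: int) -> int:
    """Length of a fragment given inclusive 1-based mRNA coordinates."""
    return end - start + 1


# Coordinates of the amplified wild-type ATR cDNA fragment (from the text)
WT_START, WT_END = 5479, 6348
# Size of the exon 33 3'-deletion in the Δ33 transcript (from the text)
EXON33_DELETION = 121

wild_type_bp = amplicon_length(WT_START, WT_END)
delta33_bp = wild_type_bp - EXON33_DELETION
```

The computed values agree with the 870 bp and 749 bp products reported for the wild-type and Δ33 fragments, respectively.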
Then, to further investigate for the presence of alternative splice transcript(s), which could be associated with the SNP c.5739-4del9+T, a series of primers were designed on exon-exon junctions to amplify different cDNA fragments covering exons 30–38 using the same cDNA samples described above. Four forward primers were designed on exon junctions 29–30, 30–31, 31–32 and 32–33 (F29-30: 5'-GAACCAGACCAGATCATTCATTA-3', F30-31: 5'-TAACAGGTCCGAGTGGACAGA-3', F31-32: 5'-CAGCAGATGGAAAATCTACAACAT-3', F32-33: 5'-GTGAGATTGCACATGTTATGTGAG-3') and five reverse primers were designed on exon junctions 33–34, 34–35, 35–36, 36–37 and 37–38 (R33-34: 5'-TTGTAATCTGGTCTTTTGTTGAGG-3', R34-35: 5'-TGGTGAACATCACCCTTGGAC-3', R35-36: 5'-CACGCGGTCACATCCTTATATT-3', R36-37: 5'-CCATATTGTAGAGATCTGCCAAAAT-3', R37-38: 5'-CCAGCTTTTTCCCATTCATAT-3'). Each forward primer was used in combination with each reverse primer, thus resulting in twenty distinct overlapping PCR fragments.
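The primer combination scheme described above (every forward primer paired with every reverse primer) can be enumerated as a simple Cartesian product. The sketch below uses only the primer names given in the text; the variable names are illustrative.

```python
from itertools import product

# Primer names as given in the text (sequences omitted here for brevity)
FORWARD_PRIMERS = ["F29-30", "F30-31", "F31-32", "F32-33"]
REVERSE_PRIMERS = ["R33-34", "R34-35", "R35-36", "R36-37", "R37-38"]

# Each forward primer is combined with each reverse primer
primer_pairs = list(product(FORWARD_PRIMERS, REVERSE_PRIMERS))

for forward, reverse in primer_pairs:
    print(f"{forward} x {reverse}")
```

Four forward and five reverse primers give 4 × 5 = 20 pairs, matching the twenty overlapping PCR fragments stated in the text.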
Given that the intronic c.5739-4del9+T variant is located at the splice acceptor site of exon 34, four additional reverse and forward primers were designed on the putative Δ33-35 and 33–35 exon junctions to verify if there is a splice variant resulting specifically from exon 34 skipping (FΔ33-35: 5'-CCAGCATTCTCCAGGGTGA-3', F33-35: 5'-TACTAAGCCTCAACAAAAGGGTGA-3', RΔ33-35: 5'-CTGGTGAACATCACCCTGGA-3', R33-35: 5'-GCCTGGTGAACATCACCCTTT-3'). PCR reactions were performed for 40 cycles using cDNA samples from c.5739-4del9+T variant carriers, as well as two wild-type individuals, and analyzed as described above.
insΔInt41 alternative splice transcript
In order to investigate whether the SNP c.7041+8G/A may lead to alternative splice transcript(s), a strategy similar to that described above was used. Two forward primers were designed on exon junctions 38–39 and 39–40 (F38-39: 5'-GTCAAAGTCATCTTATCCCATGC-3', F39-40: 5'-AACCGGTTGATGGAAGTAGTTCCA-3') and two reverse primers were designed on exon junctions 43–44 and 44–45 (R43-44: 5'-CTACTACTGTACCATGATGTAGGATCAG-3', R44-45: 5'-GTTTCTCCCTTATTGAAAAGACAATTG-3'). Furthermore, two additional reverse and forward primers were designed specifically on putative exon41-intron41 (Forward) and intron41-exon42 (Reverse) junctions to detect a splice variant in which the exon 41–42 junction could have been altered (F41-int41: 5'-CAATTCCTTGATTAATAAGGTTG-3', R42-int41: 5'-CTCTGCATCTTTTCTTAAGCACTCA-3'). The F41-int41 primer was used in combination with R43-44 and R44-45 while the R42-int41 primer was used in combination with F38-39 and F39-40. PCR reactions were performed for 40 cycles using cDNA samples from the c.7041+8G/A heterozygote carrier, as well as two wild-type individuals, and PCR products were analyzed as described in the previous sub-section. The structure of the alternative splice variant (r.[7041_7042ins7041+1_7041+441], designated in the current study as insΔInt41), amplified using either F41-int41 or R42-int41 with appropriate external primers, was confirmed by sequencing subcloned PCR products.
Quantitative RT-PCR (QRT-PCR) for the Δ33 splice transcript
Primer and Taqman probe design
Primers and Taqman probes for ATR wild-type and Δ33 splice transcript fragments were designed with the assistance of the Primer Express 2.0 software. The primer and probe sequences were: ATR wild-type: forward primer: 5'-AACTGGGTAGCTCGACTAGAAATGA-3', reverse primer: 5'-TTCATTGTAATCTGGTCTTTTGTTGAG-3', probe: 5'-FAM-TCCGGAGAGCCAGGA-3'; ATR splice transcript: forward primer: 5'-GAGGCTCCTACCAACGAGGAT-3', reverse primer: 5'-TCATTGTAATCTGGTCTGGAGAATG-3', probe: 5'-FAM-TGGTTTGATGCTATGCTC-3'. All probes were purchased from Applied Biosystems, as were primers and probe for 18S RNA, which was used as the endogenous control gene.
Subcloning and standard curves
cDNA samples prepared from RNA extracted from immortalized cell lines were used to amplify by PCR three fragments corresponding to the ATR cDNA region spanning nucleotides 5748–5858 (NM_001184), the ATR splice transcript spanning nucleotides 5626–5857 and including a deletion of the last 121 nucleotides of exon 33, and a fragment spanning nucleotides 450–619 of 18S RNA. The fragments were thereafter subcloned in the pCR®II vector (TA Cloning® from Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions. To ensure amplification specificity, the reverse primer for the wild-type ATR fragment was designed on the junction between exons 33–34, while the reverse primer specific to the alternative splice transcript was placed on the new exon 33–34 junction created by the exon 33 3'-deletion. Plasmid constructs were amplified and purified using the Plasmid Maxi Kit (Qiagen, Mississauga, ON, Canada). Specific standard curves were generated by making 2-fold serial dilutions of the plasmid constructs in the appropriate range for each quantitation assay.
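To illustrate how a standard curve built from 2-fold serial dilutions is typically used, the sketch below fits threshold cycle (Ct) against log10 of template quantity by ordinary least squares and inverts the fit to estimate an unknown quantity. This is a generic sketch, not the analysis performed by the instrument software; the starting quantity and Ct values are idealized placeholders (one extra cycle per 2-fold dilution), and all function names are assumptions.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate template quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency per cycle implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1 / slope) - 1


# Placeholder data: six 2-fold dilutions from a hypothetical 1e6 plasmid copies,
# with idealized Ct values increasing by exactly one cycle per dilution step.
quantities = [1e6 / 2 ** i for i in range(6)]
ct_values = [20.0 + i for i in range(6)]

slope, intercept = fit_standard_curve(quantities, ct_values)
efficiency = amplification_efficiency(slope)
```

Under these idealized assumptions the slope is about -3.32 cycles per log10 unit, corresponding to an efficiency of 1.0; real dilution series deviate from this, which is why a separate curve is generated for each assay.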
QRT-PCR assays
QRT-PCR assays were performed in triplicate on an ABI 7900 Sequence Detection System (Applied Biosystems) as previously described [
66]. For all the assays, a reaction mixture was prepared in a final volume of 10 μl with 1X Taqman Master Mix Buffer (Applied Biosystems) which included Taq Gold polymerase, 200 nM of Taqman
® probe, 900 nM of each primer for
ATR wild-type and alternative transcript assays or 50 nM of each primer for 18S RNA assays, and cDNA samples reverse-transcribed from total RNA. The amount of cDNA used for quantitation was 15 ng for wild-type, 150 ng for splice transcript and 1.5 ng for 18S RNA.
QRT-PCR for the insΔInt41 splice transcript
cDNA corresponding to 20, 200, 500 ng and 1 μg of total RNA from the immortalized cell lines of the c.7041+8G/A heterozygote carrier, as well as nine wild-type individuals, was used to perform fluorescence-based real-time PCR quantification using the Light Cycler Real-Time PCR apparatus (Roche Inc, Nutley, NJ). The primer sequences were:
ATR wild-type: forward primer: 5'- GTCATATACACTCCCTTTTCTTTA-3', reverse primer: 5'-GTCATATACACTCCCTTTTCTTTA-3';
ATR splice transcript: forward primer: 5'- ACCATTTACTTTGTCTCCATTA-3', reverse primer: 5'-GTCATATACACTCCCTTTTCTTTA-3'. Expression analyses were then carried out as previously described [
69].
Discussion
Since it is well established that the residual familial risk of breast cancer, not caused by
BRCA1 or
BRCA2 genes, could be explained by a polygenic or high-risk genes heterogeneity model [
72,
73], we selected individuals affected with breast cancer without mutations in
BRCA1/2 genes from high-risk families (one individual per family), in order to increase the power of the study to find genetic variants involved in breast cancer susceptibility. So far, several genes have been investigated based on their interaction with
BRCA1/2 or their involvement in DNA repair mechanisms. Since
BRCA1/2 genes are intimately linked to genomic stability, other genes involved in this pathway are very good candidates to be
BRCA3, and this is especially true of
ATM and
ATR which play a central role in genome stability maintenance. The
ATM gene has been suspected to be a breast cancer susceptibility locus, due to the presence of breast cancer in A-T families, particularly among
ATM heterozygotes [
74].
ATM mutations have already been reported to increase breast cancer susceptibility [
9,
27,
75], while some other sequence variants located in this gene do not seem to be linked to breast cancer [
24].
Based on the similar roles played by ATM and ATR as sensors of DNA damage,
ATR may be considered a putative candidate gene that could possibly explain a fraction of the remaining familial breast cancer risk. Association of
ATR germline mutations with breast cancer susceptibility has been previously analyzed in 126 Finnish families [
70], and no germline mutation was identified in this founder population. The current study, performed in a French Canadian cohort, also being a founder population, was designed to assess the possible involvement of
ATR germline deleterious mutations in breast cancer predisposition.
No deleterious germline mutation leading to premature termination of the protein was identified in the coding region. However, 41 sequence variants were identified, among which 16 were coding variants and 21 were novel changes. In addition, we find it unlikely that either of the common missense substitutions located in the FAT and kinase domains (c.6394T>G and c.7274G>A) has a significant effect on protein function because (i) their frequencies are similar in cases and controls, especially for c.7274G>A, whose MAF is greater than 20% in controls, and (ii) these residues are not well conserved in other species (Table
4). Indeed, the polymorphisms displaying a significant deviation from HWE are composed of a group of 14 uncommon polymorphisms identified in the same 3 breast cancer cases (2 homozygotes and 1 heterozygote), and therefore this most likely constitutes a single relatively rare allele. It has to be stated that no particular characteristics seem to emerge for the families bearing any of these rare variants, as both the French Canadian and the validation families have been recruited on the basis of high-risk breast cancer families.
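The deviation from Hardy-Weinberg equilibrium (HWE) noted above can be illustrated with a simple goodness-of-fit calculation. The sketch below assumes that all 54 cases were genotyped and that the 51 cases not carrying the rare allele are wild-type homozygotes; this is an assumption made for illustration, as is the use of a chi-square test, since the test actually applied is not stated in this section.

```python
def hwe_chi_square(n_wt_hom: int, n_het: int, n_var_hom: int) -> float:
    """Chi-square statistic (1 degree of freedom) for departure from HWE."""
    n = n_wt_hom + n_het + n_var_hom
    # Variant allele frequency estimated from the observed genotypes
    q = (n_het + 2 * n_var_hom) / (2 * n)
    p = 1 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt_hom, n_het, n_var_hom)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Critical value of the chi-square distribution, 1 df, alpha = 0.05
CRITICAL_1DF_5PCT = 3.841

# 2 homozygotes and 1 heterozygote (from the text); 51 wild-type cases is an assumption
chi2 = hwe_chi_square(n_wt_hom=51, n_het=1, n_var_hom=2)
deviates_from_hwe = chi2 > CRITICAL_1DF_5PCT
```

Two homozygotes for an allele this rare are far more than expected under HWE, which drives the statistic well above the critical value. Because the expected homozygote count is much smaller than 5, the chi-square approximation is unreliable here and an exact test would be preferable in practice.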
Comparison of polymorphism frequencies between our cohort and the Finnish cohort [
70] is not fully informative since the latter does not distinguish the number of heterozygotes and homozygotes found in their cohort but only the number of carriers of a given polymorphism. However, if we also use this method to calculate polymorphism frequencies observed in our cohort, only SNP40 displayed a notably lower frequency than that found in the Finnish cohort. As stated earlier, both studies (Heikkinen
et al. and the present study) have been designed to identify
ATR deleterious germline mutations in breast cancer cases. No such mutation was found in either study, therefore
ATR is unlikely to play a major role as a high penetrance gene in breast cancer predisposition. Even though novel variants have been identified, assessing the possible involvement of polymorphisms or haplotypes observed in cases compared with controls would require a much larger sample to reach statistical significance for an association with breast cancer susceptibility [
76,
77]. We thus sought to identify tSNPs that could be useful to other studies and populations.
Our pairwise linkage disequilibrium analysis (Figure
2) did not seem to identify any distinct LD blocks within
ATR. This observation is supported by the fact that SNP1 is in perfect LD with most other SNPs, including the most distal SNP41, and is also in accordance with what is seen in the French Canadian founder population which displays large conserved haplotypes as reported at the
BRCA1 locus [
49]. However, using the Haploview software, three distinct LD blocks were identified at the
ATR locus when using SNPs showing a MAF >5% in healthy French Canadian individuals (Figure
3). The breakage of strong LD seems to be located in the region of exon 31, and between exon 43 and exon 47.
Based on the same algorithm (Haploview), and using the SNPs genotyped in the HapMap database showing a MAF higher than 5%, two LD blocks could be identified: the first block comprised the SNPs located from intron 1 to exon 43, while the second block included all the remaining SNPs up to exon 47. However, it should be noted that the majority of the SNPs used to determine haplotype blocks have a MAF higher than 0.4; they thus represent common SNPs found in many different populations and probably exclude the SNPs specifically observed in our French Canadian founder population.
We were able to demonstrate that 8 tSNPs are sufficient to represent the majority of
ATR haplotypes in our French Canadian individuals, which will greatly facilitate subsequent studies. Our finding of 8 tSNPs at the
ATR locus in our population is consistent with the previously reported numbers of tSNPs required at other gene loci in other populations [
78,
79]. We can therefore be quite confident that these tSNPs will be useful in subsequent analyses. Moreover, out of 72 SNPs genotyped in the HapMap database (HapMap data rel#20 on NCBI B35 assembly, dbSNP b125) at the
ATR locus, only 40 displayed a MAF >5%. Among them, 7 tSNPs were identified, 3 of which have been identified as tSNPs in our analyses (rs10804682, rs2229032, rs1802904). Of the remaining four tSNPs identified in HapMap database (rs11920625, rs9856772, rs6805118 and rs9816736), three have not been genotyped in our cohort as they were located in intronic regions (>150–200 bp) and one (rs11920625) was not observed in our individuals.
Sequence analysis of exon 34 flanking intronic sequences revealed a deletion of 9 nucleotides + insertion T (c.5739-4del9+T), which is located 4 nucleotides upstream of this exon. In silico analysis showed that this deletion decreased the exon 34 acceptor site splicing score from 0.94 to 0.31, which suggests potential splicing alteration in this region.
Surprisingly, sequence analysis of this cDNA region in our immortalized cell lines revealed a deletion of the last 121 nucleotides of exon 33 instead of the expected skipping of exon 34 (or a portion of exon 34). This deletion of 121 nucleotides alters the ORF and results in a putative truncated protein of 1889 amino acids. Although interesting, this deletion is observed at similar levels in all tested individuals and is therefore unlikely to be related to the c.5739-4del9+T polymorphism. This Δ33 splice form may be explained by the weak wild-type donor site score of exon 33 (0.11) and the presence of an additional donor site located within exon 33, which exhibits a score of 0.63. No splice form involving the skipping of exon 34 was identified when using specific primers located on the putative Δ33-35 or 33–35 exon junctions.
Splicing scores of exon 41 flanking intronic sequences were also analyzed, since c.7041+8G/A could potentially affect splicing in this region. While the exon 41 donor splice site showed a relatively low splicing score in all species, the putative intronic donor site (splicing score of 0.90) located 441 nucleotides downstream of exon 41 became of interest, given its potential effect on splicing in this region (Figure
4). The insΔInt41 splice form could not be detected using standard procedures. However, this splice form has been amplified and subcloned by using specific primers located on this putative exon junction, demonstrating its very low mRNA expression (at the limit of detection). Due to this low expression, it was impossible to conclude whether or not this insΔInt41 splice form is associated with the c.7041+8G/A variant.
Because the ratio of the Δ33 splice form to the WT form is potentially an important factor in DNA repair and other functions related to genome stability, we performed QRT-PCR to estimate the relative abundance of WT and Δ33 splice form mRNAs, using Taqman probes to discriminate between the two forms. No correlation was found between the presence of c.5739-4del9+T in either the heterozygous or homozygous state and the expression levels of the Δ33 splice form. However, it is noteworthy that substantial relative expression of the Δ33 splice mRNA is observed in breast and ovarian tissues, as well as in MCF7 and HaCat (human skin keratinocytes) cells (Figure
6A), especially since expression levels of the WT ATR form in these tissues (Figure
6B) seem to be relatively similar to other tissues and cell lines. The ratio of expression levels between both mRNAs could therefore be of primary importance with regard to the effect of the balance of these transcript levels on cell integrity in different human tissues. However, it should be noted that only one sample per tissue was analyzed, which cannot be taken to represent mean expression in these tissues or cell lines. While alternative splicing within the non-catalytic domain of the ATR mRNA transcript causing skipping of exon 6 had already been observed [
80], in 2003 O'Driscoll et al. [
81] identified a founder mutation (2101A→G) in
ATR that affects exon 9 splicing in two related Pakistani families affected with Seckel syndrome. This study also shows an impaired response to DNA damage in a cell line from an affected parent who carried the mutation. Further characterization of ATR-Seckel cells showed impaired phosphorylation of ATR-dependent substrates, impaired G2/M checkpoint arrest and supernumerary centrosomes in mitotic cells, clearly demonstrating a role for ATR in the maintenance of centrosome stability [
82]. More recently, two other splicing alterations of ATR have been reported in clinical samples with pyothorax-associated lymphoma [
83].
Competing interests
The author(s) declare that they have no competing interests.
Authors' contributions
FD and JS conceived and devised the overall strategy for this study and authored the final version of this manuscript. YL and PS performed all DNA sequence and database analyses and drafted the manuscript. Haplotype analyses were carried out by YL while RNA expression analyses were carried out by PS. OS and ST provided insightful comments and revisions of the final version of the text and carried out DNA sequencing of the patients of the validation group. DL provided DNA samples from healthy patients. PB, JC, RL, JP, BL, RP and MP are clinicians who were in charge of collecting blood samples from affected individuals in their respective institutions and have been highly involved throughout the whole process, from recruitment to result disclosure. All authors read and approved the final manuscript.