Introduction
The observation of activating mutations in the tyrosine kinase domain of the epidermal growth factor receptor (EGFR) gene in ~10 % of all non-small cell lung cancers (NSCLC) has opened the possibility of targeted therapy with receptor tyrosine kinase inhibitors (TKIs) directed against mutant EGFR [1–3]. Since clinical phase III trials have demonstrated the benefit of TKI treatment for patients whose tumours harbour activating EGFR mutations [4, 5], routine mutation analysis of EGFR has been suggested for NSCLC specimens [6]. In contrast, activating mutations in the v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) gene, a member of the Ras family of small GTPases, were present in 27 % of NSCLCs (all adenocarcinomas) in a recently published sequencing study, and EGFR and KRAS mutations were mutually exclusive [7]. Importantly, KRAS acts downstream of EGFR in the signalling cascade, and mutant KRAS is consequently associated with resistance to TKI therapy [8]. Therefore, mutation analysis of both EGFR and KRAS is vital for individualised therapeutic decisions.
Several issues, however, hamper the use of EGFR mutation detection as a reliable diagnostic tool. First, a recent inter-laboratory comparison of routinely processed NSCLC samples revealed a significant discrepancy in EGFR mutation frequencies (6.8–25.9 %) and, hence, in the reporting of EGFR mutation status [9]. This raises the question of methodological problems in this therapeutically relevant testing. Second, in patients with extensive disease (stage IV), usually only small biopsies or cytological specimens with a limited number of tumour cells are available. This may represent a serious obstacle for mutation detection by routinely used sequencing with Sanger chemistry. For example, a recent study has demonstrated that ~30 % of NSCLC specimens in a large clinical cohort contained less than 40 % tumour cells, the minimal threshold needed for reliable detection of EGFR mutations using Sanger sequencing [10]. Other techniques frequently used for detection of EGFR mutations are based on real-time polymerase chain reaction (PCR) or pyrosequencing methods, with several commercially available kits. However, while these methods have a better sensitivity (~1–5 %) than Sanger sequencing, their targeted design means that they do not identify 5–10 % of the currently known EGFR mutations. Collectively, these obstacles underscore the need for alternative analytical principles that achieve more accurate diagnostic results.
Next-generation sequencing techniques allow massively parallel, or deep, sequencing of target regions with >1,000 reads per sample, thereby enabling detection of mutations at much lower allele frequencies than Sanger sequencing. For example, in a comparative study on 18 EGFR-mutated NSCLC specimens, 454 massively parallel sequencing detected 100 % of mutations in clinical responders to TKI therapy, compared to detection rates of only 89 % and 67 % by pyrosequencing and Sanger sequencing, respectively [11]. Despite this obviously lower detection limit, no systematic analysis of the sensitivity, reproducibility and specificity of 454 deep sequencing for EGFR and KRAS mutation analysis has been reported as yet. The unique possibility of detecting clinically relevant mutations at very low allele frequencies in the range of 1–10 % carries the risk of mistaking technical errors, which are introduced by the DNA polymerase during amplicon library preparation or by the base-calling process, for low-frequency variants [12]. Therefore, a reliable threshold for background variants is desirable to discriminate noise from low-frequency variants.
Given that clinical samples are almost exclusively available as formalin-fixed and paraffin-embedded (FFPE) tissue specimens with often low-quality DNA, a special procedure for amplicon library preparation is needed to maximise the number of informative patient specimens [13]. Amplicon library preparation commonly relies on complex PCR primers with 5´-overhangs that carry adapter sequences for binding to the DNA capture beads and barcode sequences for identifying individual patient samples, so the percentage of efficiently amplified DNA samples may be even lower. In the current study, we established a two-step DNA amplification protocol with subsequent 454 deep sequencing of the EGFR and KRAS genes, which is capable of successfully amplifying FFPE NSCLC samples with low DNA quality. We systematically evaluated its sensitivity, reproducibility and specificity and provide reliable thresholds for the lower detection limit of mutations (sensitivity), variation of allele frequencies (reproducibility) and background levels (specificity). We next applied this assay to re-evaluate clinical NSCLC samples with low tumour cell content (≤40 %) that were EGFR wild type according to conventional Sanger sequencing and identified EGFR mutations in a significant proportion of these cases. In summary, this study demonstrates the much higher sensitivity of the developed 454 deep sequencing assay compared to Sanger sequencing and strongly argues for its wide application in routine molecular diagnostic analysis of clinical FFPE NSCLC samples with low tumour content.
Materials and methods
Patient samples and cell culture
A total of 21 FFPE specimens were obtained from the Institutes of Pathology in Erlangen, Gera and Ingolstadt (Germany). The samples included cell block preparations from cytological specimens (pleural effusions, n = 4; fine needle aspirations, n = 3), small endoscopic biopsies (n = 11) and resection specimens (n = 3). Tumour specimens were inspected by a pathologist to estimate the tumour cell content and the histomorphological pattern.
The lung adenocarcinoma cell lines NCI-H1650 (EGFR exon 19 deletion E746_A750del), NCI-H1975 (EGFR exon 20 missense mutation T790M; EGFR exon 21 missense mutation L858R), NCI-H460 (KRAS exon 3 missense mutation Q61H) and NCI-H1299 (EGFR and KRAS wild type) as well as the colorectal adenocarcinoma cell line HCT-116 (KRAS exon 2 missense mutation G13D) were purchased from the American Type Culture Collection (ATCC, USA). All cells were grown in a medium consisting of 90 % Roswell Park Memorial Institute Medium 1640 supplemented with 300 mg/L L-glutamine (Invitrogen, Carlsbad, USA) and 10 % foetal bovine serum (Invitrogen).
DNA isolation and amplicon library preparation
DNA was extracted from the FFPE samples using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) and from cell lines using the DNeasy Blood & Tissue Kit (Qiagen) as suggested by the manufacturer. The concentration of DNA from the FFPE samples was measured with an ND-1000 spectrophotometer (Thermo Scientific, Wilmington, USA). DNA from cell lines, used to prepare a dilution series of mutated DNA in wild-type DNA, was quantified using the Quant-iT PicoGreen dsDNA Assay kit (Invitrogen) and a Synergy 2 Multi-Mode Microplate Reader (Biotek, Winooski, USA). These control DNA samples represented distinct percentages of mutant variants: 50 %, 10 %, 7.5 %, 5 %, 2.5 %, 1 % and 0.5 %.
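The arithmetic behind such a dilution series can be sketched as follows. This is only an illustration, not the mixing scheme actually used: it assumes that the mutant and wild-type DNA stocks have been normalised to the same concentration, and the parameters `stock_pct` (percentage of mutant variant in the undiluted mutant stock, for example 50 for a heterozygous cell line) and `total_ul` (final mix volume) are hypothetical.

```python
# Target percentages of mutant variant, as listed in the text.
TARGET_PERCENTAGES = [50, 10, 7.5, 5, 2.5, 1, 0.5]


def dilution_volumes(target_pct, stock_pct=50.0, total_ul=100.0):
    """Return (mutant_ul, wild_type_ul) for one control sample.

    Assumes both stocks have the same DNA concentration, so that the
    volume fraction of mutant stock equals target_pct / stock_pct.
    stock_pct and total_ul are illustrative assumptions.
    """
    if not 0 < target_pct <= stock_pct:
        raise ValueError("target must be > 0 and <= the stock percentage")
    mutant_ul = total_ul * target_pct / stock_pct
    return mutant_ul, total_ul - mutant_ul


if __name__ == "__main__":
    for pct in TARGET_PERCENTAGES:
        mutant_ul, wild_type_ul = dilution_volumes(pct)
        print(f"{pct:>4} % mutant: {mutant_ul:6.2f} uL mutant stock "
              f"+ {wild_type_ul:6.2f} uL wild-type stock")
```

In practice, very low percentages are usually reached by serial dilution rather than by pipetting sub-microlitre volumes directly; the function only gives the target ratios.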
A two-step amplification protocol (nested PCR) included a pre-amplification step with outer PCR primers followed by re-amplification of diluted amplicons using fusion primers with inner template-specific sequences (Table 1). Pre-amplification was carried out in 25-μL reactions that contained 50 ng of cell line DNA or a variable quantity (50–250 ng) of DNA from FFPE samples, 1.5 mM MgCl₂, 200 mM dNTP, 500 nM primers and 1 unit Phusion Hot Start Flex DNA polymerase (New England Biolabs, Ipswich, USA). The amplification programme started with an initial activation of the enzyme at 98 °C for 30 s. The initial amplification cycle consisted of denaturation at 98 °C for 10 s, annealing at 72 °C for 30 s and elongation at 72 °C for 30 s. Amplification was continued for ten cycles, reducing the annealing temperature by 1 °C each cycle, followed by 40 cycles of 10 s denaturation at 98 °C, 30 s annealing at 62 °C and 30 s elongation at 72 °C. PCR products were diluted 1:10⁶ or 1:10³ for EGFR or KRAS amplicons, respectively, and used as template for the re-amplification reaction. Re-amplification started with an initial activation of the enzyme at 98 °C for 30 s. Each amplification cycle included denaturation at 98 °C for 20 s and combined annealing and elongation at 72 °C for 40 s. This procedure was continued for 40 cycles. Negative control PCR reactions containing an equal volume of water instead of DNA were included for each amplicon on the same PCR plate. About 10 μL of each reaction was examined on 3 % agarose gels. Amplicons were purified using the Agencourt AMPure XP kit (Beckman Coulter, Beverly, USA) and quantified by fluorometry in triplicate using the Quant-iT PicoGreen dsDNA Assay kit (Invitrogen) and a Synergy 2 Multi-Mode Microplate Reader (Biotek) as directed by the manufacturers. Finally, the library was pooled in equimolar ratios, and the concentration was adjusted to 10⁷ molecules/μL.
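The conversion from a fluorometric concentration to molecules/μL, which underlies equimolar pooling and the adjustment to 10⁷ molecules/μL, can be sketched as follows. The average mass of 660 g/mol per base pair of double-stranded DNA is a commonly used approximation and an assumption here; the function names and the example amplicon length and concentration are hypothetical and not taken from the protocol.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def molecules_per_ul(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration (ng/uL) into molecules/uL."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    grams_per_mol = length_bp * BP_MASS_G_PER_MOL
    return grams_per_ul / grams_per_mol * AVOGADRO


def dilution_factor(conc_ng_per_ul, length_bp, target_molecules_per_ul=1e7):
    """Fold dilution needed to reach the target molecules/uL."""
    return molecules_per_ul(conc_ng_per_ul, length_bp) / target_molecules_per_ul


def equimolar_volume_ul(conc_ng_per_ul, length_bp, molecules_wanted):
    """Volume (uL) of one amplicon that contributes molecules_wanted copies."""
    return molecules_wanted / molecules_per_ul(conc_ng_per_ul, length_bp)


if __name__ == "__main__":
    # Hypothetical purified amplicon: 1 ng/uL, 250 bp including adapters.
    print(f"{molecules_per_ul(1.0, 250):.3e} molecules/uL")
    print(f"dilute {dilution_factor(1.0, 250):.0f}-fold to reach 1e7 molecules/uL")
```

Because the number of molecules per nanogram depends on amplicon length, pooling by equal mass would over-represent shorter amplicons; pooling by `equimolar_volume_ul` avoids this under the stated assumptions.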
Table 1
Sequences of the PCR primers for two-step amplification protocol and 454 deep sequencing
Sequences of the template-specific 3´-portion of the fusion PCR primers |
EGFR-ex18-F | TGGAGCCTCTTACACCCAGT | 179 |
EGFR-ex18-R | CCCCACCAGACCATGAGA | |
EGFR-ex19-F | CATGTGGCACCATCTCACA | 179 |
EGFR-ex19-R | CCACACAGCAAAGCAGAAAC | |
EGFR-ex20-F | CTCCAGGAAGCCTACGTGAT | 180 |
EGFR-ex20-R | CACACCAGTTGAGCAGGTACT | |
EGFR-ex21-F | CCTCACAGCAGGGTCTTCTC | 182 |
EGFR-ex21-R | TGCCTCCTTCTGCATGGTAT | |
KRAS-ex2-F | AAGGCCTGCTGAAAATGACT | 170 |
KRAS-ex2-R | AGAATGGTCCTGCACCAGTAA | |
KRAS-ex3-F | AAAGGTGCACTGTAATAATCCAGAC | 171 |
KRAS-ex3-R | AAAGAAAGCCCTCCCCAGT | |
Sequences of the outer PCR primers used for two-step library preparation |
EGFR-ex18-pre-amp-F | GCTGAGGTGACCCTTGTCTC | 246 |
EGFR-ex18-pre-amp-R | ACAGCTTGCAAGGACTCTGG | |
EGFR-ex19-pre-amp-F | GCTGGTAACATCCACCCAGA | 247 |
EGFR-ex19-pre-amp-R | GAGAAAAGGTGGGCCTGAG | |
EGFR-ex20-pre-amp-F | CACACTGACGTGCCTCTCC | 250 |
EGFR-ex20-pre-amp-R | TATCTCCCCTCCCCGTATCT | |
EGFR-ex21-pre-amp-F | GCAGAGCTTCTTCCCATGAT | 247 |
EGFR-ex21-pre-amp-R | GGAAAATGCTGGCTGACCTA | |
KRAS-ex2-pre-amp-F | TTAACCTTATGTGTGACATGTTCTAA | 262 |
KRAS-ex2-pre-amp-R | TCATGAAAATGGTCAGAGAAACC | |
KRAS-ex3-pre-amp-F | TCAAGTCCTTTGCCCATTTT | 253 |
KRAS-ex3-pre-amp-R | TGGCAAATACACAAAGAAAGC | |
Analysis by deep sequencing
Deep sequencing was performed using the GS Junior Titanium chemistry according to the standard protocols of Roche (Basel, Switzerland). Approximately 500,000 beads were loaded onto the picotiter plate, yielding on average 101,109 high-quality reads per run and an average coverage of 1,451 reads/amplicon. All reads were processed, aligned to the human reference sequences of EGFR and KRAS and analysed for mutation frequencies using the Amplicon Variant Analyser software v. 2.5 from Roche.
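To give a feeling for what an average coverage of 1,451 reads/amplicon means for low-frequency variants, the following sketch computes the expected number of variant reads and the binomial probability of observing at least a given number of them. It models read sampling only and ignores polymerase and base-calling errors; it is not the statistical procedure used in this study, and the cut-off `min_reads=5` is a hypothetical value chosen for illustration.

```python
import math

AVERAGE_COVERAGE = 1451  # reads per amplicon, from the text


def expected_variant_reads(coverage, allele_pct):
    """Expected number of reads carrying the variant."""
    return coverage * allele_pct / 100.0


def prob_at_least(coverage, allele_fraction, min_reads):
    """P(X >= min_reads) for X ~ Binomial(coverage, allele_fraction).

    Uses log-gamma to evaluate each binomial term without overflow.
    Requires 0 < allele_fraction < 1.
    """
    if not 0.0 < allele_fraction < 1.0:
        raise ValueError("allele_fraction must lie strictly between 0 and 1")
    log_p = math.log(allele_fraction)
    log_q = math.log1p(-allele_fraction)
    below = 0.0
    for i in range(min_reads):
        log_comb = (math.lgamma(coverage + 1) - math.lgamma(i + 1)
                    - math.lgamma(coverage - i + 1))
        below += math.exp(log_comb + i * log_p + (coverage - i) * log_q)
    return max(0.0, 1.0 - below)


if __name__ == "__main__":
    for pct in [50, 10, 7.5, 5, 2.5, 1, 0.5]:
        exp_reads = expected_variant_reads(AVERAGE_COVERAGE, pct)
        p = prob_at_least(AVERAGE_COVERAGE, pct / 100.0, min_reads=5)
        print(f"{pct:>4} %: ~{exp_reads:6.1f} variant reads expected, "
              f"P(>=5 reads) = {p:.4f}")
```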
Sanger sequencing
EGFR and KRAS PCR products for direct sequencing with Sanger chemistry were amplified in 50-μL reactions that contained 50 ng cell line DNA or about 100–250 ng DNA isolated from FFPE tissue, 1.5 mM MgCl₂, 200 mM dNTP, 500 nM primers and 2.5 units Taq Polymerase S (Genaxxon BioScience GmbH, Ulm, Germany). The initial denaturation step was performed at 94 °C for 3 min. The initial amplification cycle included denaturation at 94 °C for 30 s, annealing at 70 °C for 30 s and elongation at 72 °C for 40 s. This process was continued for ten cycles, reducing the annealing temperature by 1 °C each cycle, followed by 30 cycles of 30 s denaturation at 94 °C, 30 s annealing at 60 °C and 40 s elongation at 72 °C. The amplicons were purified with the QIAquick PCR Purification kit (Qiagen) and sequenced bidirectionally at an external facility (Seqlab–Sequence Laboratories Göttingen GmbH, Göttingen, Germany). The sequencing data were visualised using the FinchTV 1.4.0 software (Geospiza, Inc., Seattle, USA; http://www.geospiza.com). The sequences of the PCR and sequencing primers employed are listed in Supplementary Table S1.
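The touchdown programmes used for Sanger PCR and for pre-amplification can be written out cycle by cycle with a small helper. This is a sketch under one assumption about the wording: the ten touchdown cycles are taken to include the first cycle at the starting annealing temperature, so that the last touchdown cycle lies 1 °C above the fixed annealing temperature. The function and parameter names are hypothetical.

```python
def touchdown_schedule(start_anneal_c, touchdown_cycles,
                       final_anneal_c, final_cycles, step_c=1.0):
    """Return the annealing temperature (deg C) of every PCR cycle.

    The first touchdown cycle anneals at start_anneal_c; each following
    touchdown cycle is step_c lower. The remaining cycles anneal at
    final_anneal_c.
    """
    temps = [start_anneal_c - i * step_c for i in range(touchdown_cycles)]
    temps += [final_anneal_c] * final_cycles
    return temps


# Sanger PCR: 70 deg C start, ten touchdown cycles, then 30 cycles at 60 deg C.
sanger = touchdown_schedule(70, 10, 60, 30)

# Pre-amplification: 72 deg C start, ten touchdown cycles, then 40 cycles at 62 deg C.
pre_amp = touchdown_schedule(72, 10, 62, 40)

if __name__ == "__main__":
    print("Sanger PCR:", len(sanger), "cycles, touchdown phase", sanger[:10])
    print("Pre-amplification:", len(pre_amp), "cycles, touchdown phase", pre_amp[:10])
```

Under this reading, both programmes step down smoothly into the fixed annealing temperature (61 °C to 60 °C and 63 °C to 62 °C, respectively).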