Background
In recent years, personalised cancer medicine and the development of receptor tyrosine kinase inhibitors as well as monoclonal antibodies for targeted therapies led to dramatic improvements in cancer treatment and patient care. Nonetheless, most patients develop drug resistance and relapse after an initial treatment response [
1,
2]. Numerous studies have investigated the underlying mechanisms of drug resistance and showed, among other findings, that secondary mutations of the gene encoding the target protein are responsible for drug resistance [
3,
4]. The emergence of secondary gene mutations in a heterogeneous tumour population follows the principles of Darwinian selection. Thus far, it is not entirely understood whether these mutations develop by means of mutagenesis during therapy or whether they are present in pre-existing minor subclones of a tumour subpopulation and are selected for during therapy [
5,
6]. Sensitive methods as well as mathematical models, such as the Luria-Delbrück model, have led to the identification of pre-existing resistant subclones prior to therapy in some tumour entities: in non-small cell lung cancer, the EGFR resistance mutation p.T790M, and in colorectal carcinoma, secondary KRAS mutations, were detected down to a frequency of 0.01% [
7,
8]. In this study, primary and secondary gastrointestinal stromal tumours (GISTs) were analysed. Between 75 and 80% of GISTs are characterised by activating mutations in the
KIT gene [
9]. Primary unresectable or metastatic KIT-positive GISTs are commonly treated with the receptor tyrosine kinase inhibitor imatinib (Glivec®, Novartis Pharma). After an initial treatment response, nearly half of the patients show tumour progression within two years [
10,
11]. The most common resistance mechanism is the acquisition of secondary resistance mutations in the
KIT gene [
11,
12]. It is still unknown whether the secondary resistance mutations pre-exist in minor subclones or develop “de novo” during therapy [
5,
11,
13-
15]. This study used currently available ultrasensitive methods to investigate whether secondary
KIT mutations pre-exist in minor subclones in GISTs. For this approach, three massively parallel sequencing assays were used on the GS Junior (Roche, Mannheim, Germany) and on the MiSeq™ (Illumina, San Diego, CA, USA). The detection of pre-existing resistant subclones would be a crucial contribution to the choice of treatment course. Primary and secondary
KIT mutations could then be targeted simultaneously by a combination of tyrosine kinase inhibitors, which might prevent tumour growth and progression due to resistance.
Discussion
The development of secondary resistance mutations during imatinib therapy is the most common resistance mechanism in GISTs. Experimental evidence of whether secondary mutations are pre-existing in minor subclones or develop “de novo” during therapy has yet to be provided and would help to develop new therapeutic strategies in GISTs.
In this study, 33 primary GISTs with known progressed disease and secondary resistance mutations were analysed on the GS Junior (Roche) and on the MiSeq™ (Illumina) with three different assays.
With an achieved sensitivity of 0.02% mutated alleles in the background of wild-type alleles for KIT exon 17 p.N822Y, p.N822K and p.Y823D mutations on the MiSeq™ (Illumina) with the AmpliSeq panel, no pre-existing subclones were detected with any of the three assays. The limit of detection varied between individual secondary mutations. Additionally, it could be seen that at each position of secondary mutations some negative samples (samples without later emerging secondary mutations) had higher allele frequencies than the samples with later emerging secondary mutations. Thus, the threshold used to distinguish positive from negative cases was determined for each position of secondary mutations by the allele frequencies of the negative samples, correlating with the limit of detection.
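The position-specific thresholding described above can be illustrated with a minimal Python sketch. It assumes that the threshold at a position is simply the highest allele frequency observed among the negative samples; the exact rule is not specified in the text, and the function names and example values are hypothetical, not data from this study.

```python
def position_threshold(negative_afs):
    """Return the highest allele frequency (in %) among negative samples.

    Assumption: the background at a position is taken as the maximum
    allele frequency seen in samples without a later secondary mutation.
    """
    if not negative_afs:
        raise ValueError("at least one negative sample is required")
    return max(negative_afs)


def exceeds_background(sample_af, negative_afs):
    """True if a sample's allele frequency lies above the position threshold."""
    return sample_af > position_threshold(negative_afs)


# Hypothetical allele frequencies (%) at one position, for illustration only.
negatives = [0.02, 0.05, 0.03]
print(exceeds_background(0.04, negatives))  # False: within background
print(exceeds_background(0.20, negatives))  # True: above all negatives
```

Under this rule, a primary tumour whose allele frequency does not exceed that of the negative samples cannot be distinguished from background, which is the situation reported for all positions analysed here.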
On both systems the sensitivity of the assay was limited by background noise. Particularly high background noise and artificial T > C substitutions at the position of the p.V654A mutation posed a problem and led to a higher detection limit. Artificial T > C transitions could be artefacts which are associated with formalin fixation and are a common problem in FFPE material, especially when using small biopsies and low DNA content [
23,
24]. Formalin cross-links cytosine nucleotides on either strand and/or deaminates cytosine to uracil and adenine to hypoxanthine. During PCR, the Taq polymerase then incorporates an adenine instead of a guanine and a cytosine instead of a thymine, creating non-reproducible C<>T and G<>A mutations [
24-
26].
Forshew et al. showed in 47 FFPE samples that background frequencies of artificial substitutions were around 0.1% and varied depending on base substitution and loci [
27].
To reduce the effect of fixation artefacts and background noise, three approaches were chosen: the sequencing depth was increased, fresh-frozen material was analysed, and FFPE material was treated with uracil-N-glycosylase (UDG).
It is well established that the detection of low mutant allele frequencies depends, among other factors, on the sequencing depth. Thus, an increase in the sequence coverage leads to an increase in the detection sensitivity of somatic variants by decreasing the background noise [
28-
32]. This effect was also seen in our study. However, a much greater increase in sequencing depth was achieved here than has been published so far. This approach was first shown on the GS Junior (Roche): we increased the sequencing depth, and thus the method sensitivity, by sequencing 12 independent libraries of a single case with the same barcode. Combining 12 independent PCR reactions in this way also decreased amplification errors and thus the background noise. We were able to increase the coverage from 828 to 48,087 and to decrease the background noise from 1 to 0.4%, while the allele frequency at the position of the p.V654A mutation decreased from 0.85 to 0.16%. On the MiSeq™ (Illumina), we observed the same effect of increased coverage and decreased background noise when analysing the same sample at different coverages. Here, only one PCR reaction per sample was used.
Generally speaking, with the MiSeq™ (Illumina) an approximately 70-fold increase in sequencing depth led to an at least 3-fold decrease in the background noise. However, the principle described above could not be observed in all experiments. On the MiSeq™ (Illumina), the AmpliSeq panel showed in some amplicons a more than 10-fold increase in the sequencing depth in comparison to the Qiagen panel, but a reduction of the background noise at the positions of the secondary mutations could not be observed at each position.
Thus, in our study, the reduction of background noise and increase in detection sensitivity by increasing the sequencing depth of the method led to the same results. No pre-existing secondary mutation exceeded the background noise (the allele frequency at the relevant position of the secondary mutations) in the primary tumour samples.
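The relationship between sequencing depth and the chance of sampling a rare allele can be sketched in Python using the figures quoted above. The sketch assumes that reads are sampled independently and ignores sequencing and fixation errors, so it gives only an upper bound on what depth alone can achieve; the function names are hypothetical.

```python
import math


def fold_change(before, after):
    """Ratio of a value after an intervention to its value before."""
    return after / before


def expected_mutant_reads(allele_fraction, depth):
    """Expected number of reads carrying the mutant allele."""
    return allele_fraction * depth


def prob_at_least_one_mutant_read(allele_fraction, depth):
    """P(at least one mutant read), assuming independent read sampling."""
    # log1p keeps the calculation stable for very small allele fractions.
    return 1.0 - math.exp(depth * math.log1p(-allele_fraction))


# Coverage reported for the GS Junior experiment: 828 -> 48,087 reads.
print(round(fold_change(828, 48087), 1))  # roughly 58-fold more reads

# A subclone at 0.02% (two mutated alleles in 10,000) at both depths.
for depth in (828, 48087):
    print(depth,
          round(expected_mutant_reads(0.0002, depth), 2),
          round(prob_at_least_one_mutant_read(0.0002, depth), 4))
```

At a coverage of 828, a 0.02% subclone would be expected to contribute fewer than one read, whereas at 48,087 it would contribute about ten. Sampling such reads is necessary but not sufficient for detection, because they must still exceed the position-specific background noise.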
We analysed six fresh-frozen samples with the GS Junior (Roche) and three fresh-frozen samples with both MiSeq™ (Illumina) panels. With the GS Junior (Roche) no minor frequencies of mutated alleles were seen at four positions of secondary mutations (p.V654A, p.N680K, p.D820E, p.N822Y). With the MiSeq™ (Illumina) minor allele frequencies of the mutated allele were detected, but the frequencies and the sensitivity were the same as with the FFPE material and were thus determined as background noise.
Spencer et al. showed that most high-quality base discrepancies were not significantly different between FFPE and fresh-frozen material, and are rather due to sequencing errors and DNA damage. Only C > T and G > A transitions were significantly increased when comparing FFPE and fresh-frozen material [
33].
Nguyen et al. showed that transitions are especially prone to sequencing errors due to base-pairing and reading errors. They showed >1% erroneous sequences independent of the material source [
34]. Another study showed the presence of 0.05 – 1% sequencing errors with human cells and bacterial DNA [
35].
Additionally, 19 of the 33 primary GISTs were extracted with the GeneRead DNA FFPE Kit (Qiagen) and sequenced with the AmpliSeq panel. This kit uses UDG, which reduces C > T (and G > A) sequence artefacts [
26,
36]. Do et al. showed that UDG treatment reduces the allele frequency of G > A artefacts from 0.1 – 2.07% to 0.1 – 0.7%. However, as UDG removes uracil from damaged FFPE DNA, only C > T and, correspondingly, G > A transitions are reduced. Therefore, no reduction in T > C artefacts at the p.V654A position was seen.
At the positions of the exon 14 and exon 17 substitutions, the allele frequencies of the mutated allele, and thus the background noise, were often as low as 0.02% on the MiSeq™ (Illumina). These substitutions were mostly G<>T and T<>A transversions, which are not affected by fixation artefacts or sequencing errors. Nevertheless, no minor resistant subclones could be detected at these positions.
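The distinction drawn above between transitions, transversions and the subset of artefacts that UDG can reduce can be written down as a short Python sketch. The function name and the set of UDG-addressable changes (C > T and its complementary-strand reading G > A) are illustrative assumptions based on the description in the text, not part of any assay software.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
# Per the text, UDG removes uracil arising from cytosine deamination,
# so only C>T (read as G>A on the opposite strand) is reduced.
UDG_ADDRESSABLE = {("C", "T"), ("G", "A")}


def classify_substitution(ref, alt):
    """Return (kind, udg_addressable) for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("ref and alt must be two different bases of A, C, G, T")
    # A transition stays within purines or within pyrimidines.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    kind = "transition" if same_class else "transversion"
    return kind, (ref, alt) in UDG_ADDRESSABLE


print(classify_substitution("T", "C"))  # ('transition', False), as at p.V654A
print(classify_substitution("C", "T"))  # ('transition', True)
print(classify_substitution("G", "T"))  # ('transversion', False)
```

This makes explicit why UDG treatment would not be expected to lower the T > C background at the p.V654A position, even though T > C is a transition.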
Further, low-diversity libraries, i.e. libraries with only a few amplicons, may lead to an imbalance in sequence reads of the forward and reverse strand in MiSeq™ (Illumina) runs with normal cluster densities. Due to the low number of different amplicons, the likelihood of clusters of the same amplicon appearing next to each other on a flow cell is higher than in MiSeq™ (Illumina) runs with more diverse libraries. When analysing low-diversity libraries, the MiSeq™ (Illumina) cannot distinguish between the individual clusters and might detect the wrong nucleotide. As this reading error occurs in the two sequencing runs independently, it results in an imbalance between the two sequence reads and leads to the detection of false positives with a falsely higher allele frequency. To increase the run quality, it is recommended that the cluster density be decreased and that only balanced sequence reads be analysed [
37-
39]. This approach was also applied in this study. To show the risk of imbalanced sequencing reads and false positives when using low-diversity libraries, one run showing imbalanced sequencing reads at the position of the secondary mutation in KIT exon 14 (p.N680K) was included in this paper. In this run, only cases without later emerging p.N680K mutation were included.
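A simple check for balanced forward and reverse reads, as discussed above, might look like the following Python sketch. The balance measure (smaller count divided by larger count) and the cut-off of 0.5 are assumptions chosen for illustration; the text does not state which criterion was applied in this study.

```python
def strand_balance(forward_reads, reverse_reads):
    """Ratio of the smaller to the larger strand count (1.0 = fully balanced)."""
    if forward_reads < 0 or reverse_reads < 0:
        raise ValueError("read counts must be non-negative")
    if forward_reads + reverse_reads == 0:
        raise ValueError("no reads at this position")
    return min(forward_reads, reverse_reads) / max(forward_reads, reverse_reads)


def is_balanced(forward_reads, reverse_reads, min_ratio=0.5):
    """True if the strand ratio meets the (hypothetical) minimum ratio."""
    return strand_balance(forward_reads, reverse_reads) >= min_ratio


# Hypothetical variant-supporting read counts, for illustration only.
print(is_balanced(480, 520))  # True: both strands support the call
print(is_balanced(95, 5))     # False: almost all support from one strand
```

A variant call supported almost exclusively by one strand would be treated as a likely reading error under such a filter, rather than as evidence of a minor subclone.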
In addition to the massively parallel sequencing, a wild-type blocking LNA-mediated clamping assay (TIB Molbiol) for the p.V654A substitution was used in this study. With a sensitivity of 0.4% the assay yielded no other results than the massively parallel sequencing (data not shown). All samples were wild-type for p.V654A.
New large-scale sequencing approaches have revealed the extensive intra- and intertumour heterogeneity in many cancers [
40-
42]. In renal cancer 63 – 69% of mutations were not detectable in every tumour region [
40]. Therefore, the detection of subclonal mutations is important as these subclones may contribute to primary and acquired resistance [
43-
45].
This tumour heterogeneity and the development of polyclonal resistance mutations during therapy have also been described for GISTs [
5,
10,
11]. Wardelmann et al. showed that a biopsy is not representative of the whole tumour [
5,
11]. In our study, five of the 33 primary GISTs were segmented into a total of 52 subregions to minimise the analysed tumour region and reduce the wild-type background. However, this approach led to similar results, and no minor resistant subclones could be detected prior to tyrosine kinase inhibitor therapy. It remains unresolved whether the detection limit of two mutated clones in 10,000 wild-type clones was not sensitive enough, whether heterogeneous tissue samples are, per se, unsuitable for the detection of very small subpopulations of mutated cells, or whether no subclones were present at all.
The assessment of the probability of pre-existing resistant subclones is an ongoing challenge. In some tumour entities, pre-existing resistant subclones could be detected. In colorectal carcinoma
KRAS resistance mutations were detected with an allele frequency of 0.2%. In non-small cell lung cancer p.T790M
EGFR resistance mutations were detected with an allele frequency of 0.4 – 0.02% [
4,
7,
8]. These mutations were mainly detected with TaqMan assays, massively parallel sequencing approaches and mathematical modelling. The method sensitivity in our study was within the same range. However, in our study no pre-existing resistant subclones were detected. This is in concordance with published theories, which state that in GIST resistance mutations develop “de novo” during therapy as GIST patients with developing secondary resistance mutations are commonly treated longer with the tyrosine kinase inhibitor imatinib than resistant patients without these mutations [
14]. Hence, it is assumed that clonal selection of pre-existing resistance mutations in GIST is unlikely.
In the lung and colorectal carcinoma studies mentioned above, pre-existing subclones were determined in blood samples and cell cultures.
Therefore, the analysis of circulating tumour DNA may be promising for the early detection of resistance mutations, as it may overcome the problems of tissue heterogeneity and formalin fixation, and may also be useful in the detection of pre-existing resistant subclones [
46-
48].
Further, mathematical models have already been used and might be useful to predict pre-existing resistant minor subclones in combination with experimental and clinical data in GISTs [
15].
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
CH and SMB drafted the manuscript. CH, NK, MO, FH, EW and SMB conceived and designed the study and the experiments. CH and NK performed the experiments. CH, NK, FH, JF, MAI, HK and AS were involved in data interpretation and analysis. RB, EW and HUS participated in the coordination of the study and helped to draft the manuscript. All authors read and approved the final manuscript.