Introduction
Gastric cancer is one of the leading causes of cancer death in China with 5-year overall survival (OS) that is barely over 30% [
1,
2]. The backbone therapeutic for advanced gastric cancer (AGC) patients remains to be multi-drug combination chemotherapy, which has shown efficacy that has largely plateaued [
3]. The median progression-free survival (PFS) of AGC patients is about 6–8 months in first-line setting. Despite progress made in understanding mechanisms that explain chemotherapy resistance of AGC using bioinformatics and multi-omics analysis technologies, biomarkers deemed to be able to identify patients who are more likely to respond to combination chemotherapy are surrounded by controversy [
4‐
6].
Options available to AGC patients as subsequent regimens after first-line treatment are limited [
7‐
9], where the efficacy of current treatment patterns is unsatisfactory. Under the pressure of chemotherapy, tumor cells may evolve to adapt to chemotherapy and acquire a resistant phenotype. Fitness trade-offs occur during this process, which could induce vulnerability to some other drugs. Thus, understanding the molecular evolution in place during the treatment may help to optimize the subsequent treatment. Re-biopsy for late-stage patients is normally difficult, and high inter- and intra-tumoral heterogeneity of gastric cancer suggests that genomic analysis based on tissue biopsy may hardly provide results that are representative of the molecular alterations of the tumor.
Analysis of circulating tumor DNA (ctDNA) using next generation sequencing (NGS) offers a mini-invasive method to avoid the intra-tumoral heterogeneity and to dynamically monitor the molecular alterations in place [
10‐
12]. The correlation between the genomic alterations of ctDNA and the clinical efficacy of the systemic treatment of gastric cancer has already been recognized in the literature [
13]. Changes of quantitative data including ctDNA concentration, mutant allele fraction, molecular tumor burden index which have been reported by several research groups can provide indications of cancer recurrence, overall survival, response to systemic therapy of gastrointestinal cancer patients [
12,
14,
15]. Somatic mutations of specific genes like
TGFBR2,
RHOA,
ERBB2 in ctDNA were found as potential biomarkers of the response to immunotherapy or HER2-trageted therapies [
16‐
18]. However, most of these studies focused on genomic alterations of ctDNA at a specific timepoint during treatment. The dynamic molecular changes and their potential clinical meaning during the sequential chemotherapy in AGC have not been fully delineated.
In the present study, peripheral blood samples were serially and prospectively collected from AGC patients who undergone treatments by sequential chemotherapy regimens. Plasma ctDNA were analyzed utilizing a customized NGS panel. Our aims were to identify the correlations between the genomic features of ctDNA and the patients’ outcomes and to profile the molecular evolutionary process of AGC patients during the sequential chemotherapy to further explore the potential therapeutic options available to these patients.
Discussion
In the present study, both baseline and dynamic changes of genomic features in ctDNA were found to be correlated with the treatment efficacy of first-line chemotherapy for AGC patients. Fluctuating changes in SNVs and CNVs could be observed during sequential chemotherapy, where the impact of the subsequent regimens on the molecular changes driven by first-line therapy was determined to be subtle. Under the pressure of conventional chemotherapy, novel SNVs and CNVs could be enriched into multiple common oncologic signaling pathways, where their distributions were established to be highly heterogenous among patients.
The correlation between the genomic features of ctDNA and the chemotherapy efficacy in AGC patients has been analyzed in previous studies, where most of the studies were focused on reporting the data of somatic mutations, including number of gene mutations and VAF [
12,
14,
20]. Recent more detailed analyses of ctDNA revealed that the molecular changes correlated with ant-HER2 targeted therapy and anti-PD-1 immunotherapy [
15,
16,
18]. Nevertheless, ctDNA analysis during conventional sequential chemotherapy—which is still the standard treatment pattern for most AGC patients in China—have not been fully explored.
We found that AGC patients who had higher baseline values of CNI and CNV burden in ctDNA were more likely to respond to platinum-based chemotherapy. Somatic CNVs were correlated with chromosome instability, which resulted from defects in mitosis and pre-mitotic replication stress [
21]. In AGC patients, it has been reported that high level chromosome instability was associated with sensitivity to platinum-based chemotherapy [
22]. In colorectal cancer patients, tumors with increased copy number alterations also indicated sensitivity to chemotherapy plus bevacizumab [
23]. Also, chromosome instability has been found to be associated with intrinsic resistance to taxanes [
24]. In our study, we found that both CNI and CNV burden in ctDNA recovered to their baseline levels at treatment failure of first-line regimens. The responses to paclitaxel-based and irinotecan-based regimens as second-line treatments were both found to be poor in our cohort. This was in line with the findings of previous randomized controlled trials [
9]. Our findings suggest that patients with high CNI and high CNV burden may not benefit from the standard pattern of subsequent chemotherapy of AGC.
Even though the values of CNI and CNV burden at disease progression were similar to those at baseline, treatment resistance still occurred. Novel SNVs and CNVs were observed in ctDNA at PD. Compared with somatic mutations, platinum-based chemotherapy could cause more gene-level CNVs in ctDNA. A pan-cancer study demonstrated the existence of a positive linear influence of CNVs on the expression of most of the genes, which was indicative of the presence of a direct effect of CNVs on gene transcriptional level [
25]. Furthermore, gene-level CNVs are considered as common and important molecular alterations in tumors and are usually the upstream changes that trigger the cascade of signaling chaos [
26]. In our study, KEGG analysis of novel CNVs showed that several oncologic and treatment-resistance pathways, including EGFR tyrosine kinase inhibitor resistance and platinum drug resistance, were enriched. These novel genomic alterations could be attributed to resistance to the first-line chemotherapy.
Tumor evolutionary theory provides an alternative strategy to optimize the current treatment paradigm of cancer [
27]. Spontaneous fitness trade-offs are the cost of evolution during tumor adaptation to the drastic external pressure of chemotherapy. These fitness trade-offs lead to the appearance of vulnerability in tumor cells to specific subsequent therapies [
28‐
30]. This phenomenon creates an opportunity for oncologists to anticipate tumor evolutionary process under the pressure of certain drugs and design an optimal subsequent regimen to prolong disease control [
31]. However, although we found that genes with novel SNVs and CNVs could be identified during and after the whole course of the treatment, the corresponding distributions among AGC patients were highly heterogeneous.
Molecular matched therapy has been performed based on molecular alterations detected in tumors. However, the benefit of this strategy in subsequent treatment of gastric cancer is uncertain. SHIVA trial did not show a significant benefit, and the absolute time of survival extension in VIKTORY trial was reported to be only about 2 months [
32,
33]. In our study, the genes involved in relevant signaling pathways were found to be highly heterogeneous among patients. It has been concluded that while common aberrant pathways might result in treatment failure, inter-patient heterogeneity and signaling crosstalk could lead to inefficacy of single-targeting molecular agents [
34]. Thus, a multi-targeting strategy should be employed in subsequent regimens of AGC patients. In EPOC1706 trial, lenvatinib plus pembrolizumab in the first-line or the second-line setting of ACG patients showed an objective response rate of 69%, which was much higher than that of previously-employed standard second-line chemotherapy of AGC patients [
35]. REGONIVO trial also showed a high response rate for the treatment of refractory AGC patients with regorafenib plus nivolumab [
36]. The synergistic effects in place between lenvatinib/regorafenib and immune checkpoint inhibitors such as pembrolizumab and nivolumab, together with the fact that both lenvatinib and regorafenib are multi-targeting tyrosine kinase inhibitors of VEGFR, FGFR, PDGFR, KIT and RET, should both account for such enhanced clinical efficacy[
37,
38].
Tumor cells can adapt to external treatment pressure by acquisition of new genomic or epigenomic changes [
39]. This means that the dynamics of tumor evolution should also be considered in devising therapeutic patterns for cancer. In a study on the genomic landscape of relapsed acute lymphoblastic leukemia, the prevalence of relapse-specific somatic alterations in patients showing very early relapse (< 9 months from diagnosis) was found to be significantly less than that in the patients who exhibited an early or late relapse [
40]. The median PFS of AGC patients in the first-line treatment is normally 6–8 months. Despite being for different kinds of tumors, these results suggest that gastric cancer cells might not have long enough time to acquire new driver somatic alterations under treatment pressure. In this context, Li et al. have also reported quite similar CNV profiles in paired tissue samples collected before and after neoadjuvant chemotherapy from gastric cancer patients who had no response [
41].
The present study bears certain limitations. First, the fact that only 30 patients were enrolled in our study made the sample size relatively small. Nevertheless, analysis of 108 ctDNA samples could present a representative view over the dynamic molecular changes during the treatment. Second, analysis of subsequent therapy was only performed in 10 patients. The fact that the available information found in the literature regarding the dynamic molecular changes during sequential chemotherapy of AGC patients is limited draws attention to the importance of the results of our study. Third, the analysis of SNVs and gene-level CNVs in our study was not accompanied by the examination of the possible correlations between them and the gene expression. Thus, it is our aim to investigate the transcriptional data in future studies using a larger sample size.
In summary, baseline and dynamic changes of genomic features of ctDNA could be biomarkers to predict treatment efficacy of platinum-based first-line chemotherapy in AGC patients. After treatment with standard chemotherapy regimens, convergent oncologic pathway enrichment yet characterized by inter-patient heterogenous gene-level CNVs was identified, which provided valuable information for the optimization of the sequential treatment of AGC patients.
Materials and methods
Patients and samples
Peripheral blood samples were collected from gastric cancer patients who were treated in Department of Oncology, Ruijin Hospital and met the following criteria: (1) age over 18 years; (2) histologically confirmed HER2 negative stomach adenocarcinoma; (3) treatment naïve locally advanced, recurrent or metastasis disease; (4) adequate performance status and organ function to receive standard chemotherapy. Written informed consent was provided by all patients before collecting the peripheral blood samples. The protocol was approved by ethics committee of Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, People’s Republic of China.
Blood samples were collected at times of baseline, efficacy assessment and disease progression. Samples collected at treatment failure of one regimen were considered as baseline samples for the regimen that followed. Regimens with at least two samples during treatment were eligible for further analysis. Treatment efficacy was assessed by radiographic examination (contrast enhanced CT or MRI) following the routine clinical protocols every 8–9 weeks. Objective response was assessed according to the RECIST criteria (v1.1). Clinicopathological information of all the patients was recorded, and PFS and OS were followed up. The best objective response during first-line treatment was used to stratify patients into R and NR groups. During the whole course of treatment, treatment response at each sampling timepoint was compared with last timepoint according to RECIST criteria (v1.1). Tumor response of each timepoint was listed in Additional file
11: Table S5.
DNA extraction, sequencing and data processing
Plasma and blood cells were separated by centrifugation at 1600 g for 10 min. Cell-free DNA (cfDNA) was isolated from plasma using MagMAX
™ Cell-Free DNA Isolation Kit (Thermo Fisher Scientific, Waltham, MA, USA). Blood cell DNA was extracted from blood cells with TIANamp Blood DNA Kit (TIANGEN, Beijing, China). The concentration of DNA was measured by Qubit
® dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA), and the quality of DNA was assessed by Agilent 2100 BioAnalyzer (Agilent, USA). Genomic DNA was sheared into 150–200 base-pair (bp) fragments with Covaris M220 Focused-ultrasonicator Instrument (Covaris, Massachusetts, USA). Fragmented DNA was used to construct the library using KAPA Hyper Preparation Kit (Kapa Biosystems, USA) according to the manufacturer’s instruction. DNA was hybridized to one in-house panel (Genecast, Wuxi, China) covering 2.43 Mb of the human genome with 1632 genes (Additional file
12: Table S6). The captured library was sequenced on Illumina NovaSeq 6000 according to the manufacturer’s instruction, producing paired-end reads with the length of each end as 150 bp.
Preliminary sequencing results in BCL format were converted to FASTQ files using bcl2fastq (v2.20.0). FASTQ format reads were processed using Trimmomatic (v0.39) [
42] for adapter trimming and low-quality reads filtering. Processed reads were mapped to the reference genome (hg19) using BWA (0.7.17) [
43]. Mapped results were sorted and marked for duplications using Picard toolkit (version 2.1.0,
https://broadinstitute.github.io/picard/), and then were realigned using GATK (version 3.7) [
44]. The quality control data were shown in Additional file
13: Table S7. The average sequencing depth of CF and BC samples was > 1000 × and > 200 × , respectively. The distribution of fragment sizes for each sample is shown in Additional file
4: Figure S4.
Somatic SNV calling
SNVs were called via VarDict (v1.5.1) and FreeBayes (v1.2.0) from processed mapping results in pair of tumor and control samples, respectively. The candidate somatic mutations were annotated with ANNOVAR (2015Jun17) [
45] and then were filtered by ExAC (
http://exac.broadinstitute.org), gnomAD (
http://gnomad-sg.org/), COSMIC (
https://cancer.sanger.ac.uk/cosmic), dbSNP (
https://www.ncbi.nlm.nih.gov/snp/) databases. Nonsynonymous mutations among the exonic and splicing regions were kept for the final somatic mutation data set (Additional file
5: Figure S5). The filter parameters of SNV include the following criteria: (1) exclude mutations located in intergenic regions or intronic regions. (2) exclude blacklisted mutations. (3) exclude mutations with sequencing depth < 120. (4) exclude mutations with allele frequency < 1%. (5) exclude mutations with reads < 5. (6) exclude synonymous mutations. (7) exclude allele frequency ≥ 0.002 in the Exome Aggregation Consortum (ExAC) database. (8) exclude mutations also present in blood control samples and the frequency in tumor sample is less than five times frequency in blood control sample (Additional file
14: Table S8).
Somatic CNV calling
Taking blood cell sample as the control, somatic CNVs were called from plasma sample for each gene included in the captured panel using CNVkit (0.9.5.dev0) with batch mode [
46]. In the process of calling, biases of read depth, GC content and repetitive sequence were removed through normalization (Additional file
6: Figure S6). The filtering criterion of CNV is the number of capture region more than 5, and the threshold of gain and loss are copy ratio > 2.5 and copy ratio < 1.5, respectively. The CNV value was defined as log2 transformed ratio of normalized read depth on each gene between plasma and blood cell samples (Additional file
15: Table S9).
Calculation of tumor mutation burden (TMB)
TMB were determined by the nonsynonymous somatic mutations among the exonic and splicing regions. Alterations likely or known to be bona fide oncogenic drivers were excluded (Additional file
7: Figure S7). TMB per megabase were calculated with the total number of mutations divided by the total bases of the target panel with ≥ 500 bp coverage.
Definition of meanVAF and maxVAF
The maxVAF value equals to the maximum VAF of all somatic mutations in each sample. The meanVAF value was defined as the mean value of all mutations’ VAFs in each sample.
Calculation of CNV burden
CNV burden is equal to the total number of CNVs in each sample. Considering the two types of CNVs, the CNV burden was divided into gain and loss. The total CNV of gain and total CNV of loss were defined as the CNV gain burden and CNV loss burden, respectively.
Estimation of ctDNA content fraction (CCF)
CCF of plasma samples was estimated by a maximum likelihood model based on SNVs and CNVs in the paired blood cell and plasma samples. Somatic and germline SNPs that met the following criteria were used to build the model: (1) with a minimum depth of 50 × in the paired samples; (2) not on genes with high polymorphism; (3) no InDels in the 50 bp upstream or downstream regions; (4) not in a copy number gain region; (5) germline SNPs with significantly different variant allele frequencies (VAFs) in the paired samples, or somatic SNPs with VAFs significantly higher than background noise. These SNPs were defined as informative SNPs and were clustered into multiple groups according to their VAFs, local copy numbers and hypothetic genotypes. The hypothetic genotypes in cfDNA and ctDNA were determined by the VAFs in the paired samples and the copy number in the plasma sample. Each cluster represents a unique ctDNA source. We then calculated the likelihood of observing SNPs under given CCFs in each cluster. By maximizing the likelihood, CCF of each cluster could therefore be estimated. Cluster with the highest CCF was from the main source of ctDNA, and its CCF was the output of the final estimation.
Estimation of copy number instability (CNI)
After correction for GC content and length of target region using proprietary algorithms for each region, the read counts were transformed into log2 ratios and were converted into Z-score based on Gaussian transformations versus a normal control group (n = 30). The target regions that satisfied the Z-score greater than the 95th percentile plus twice-times absolute standard deviation of the normal control group were retained, and these Z-scores were summed as the CNI score.
Statistical analysis
For the comparison of continuous variables, we chose Wilcoxon rank sum test. When a continuous variable was transformed into a categorical one, the best cutoff was determined through pROC (1.18.0) package [
47]. Comparisons between categorical variables were conducted using Fisher’s exact test. Kaplan–Meier survival curves with log-rank test were generated by survminer (0.4.9) and survival packages (3.3–1). The visualization of gene alteration landscapes was profiled by ComplexHeatmap package (2.12.0) [
48]. The KEGG analysis of newly emerged gene alterations was performed on clusterProfiler package (v4.4.4) [
49]. We customized 1632 panel genes as background gene sets for pathways enrichment. All analyses were carried on R (4.2.1) program.
P < 0.05 was considered as statistically significant.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.