Introduction
Diffuse large B cell lymphoma (DLBCL) is a heterogeneous and aggressive lymphoid neoplasm, the treatment of which has significantly improved in the last decade with the addition of the anti-CD20 monoclonal antibody rituximab (R) to the chemotherapy regimen consisting of cyclophosphamide, hydroxydaunorubicin, vincristine, and prednisone (R-CHOP) [1]. Since then, R-CHOP has become a standard treatment for the vast majority of primary DLBCL cases, with cure rates of about 60%. The remaining 40% of cases either relapse after a period of remission or are refractory to first-line therapy. Patients in this group are treated with aggressive salvage regimens supported by autologous stem cell transplantation (ASCT) [2], but success rates are modest, particularly in primary refractory DLBCL [3, 4]. Therefore, it is important to identify high-risk patients before administration of first-line therapy so that potentially more aggressive tumors can be treated with alternative regimens [5].
Risk stratification of DLBCL has relied for more than 20 years on the International Prognostic Index (IPI), which is based on evaluation of multiple clinical parameters [6]. Following the addition of rituximab, the IPI was revised (R-IPI) [7] and, most recently, further enhanced (NCCN-IPI) to better identify high-risk patients [8]. According to the NCCN-IPI, the high-risk group has a 5-year overall survival (OS) probability of 33%, compared to 54% predicted by the IPI, although this remains to be confirmed in datasets from prospective trials.
In addition to patient-based factors, tumor-based prognostic markers have been proposed. It is known that ~10% of DLBCL cases have MYC rearrangements that are strongly associated with worse outcomes, especially if linked to MYC protein overexpression [9]. It has also been shown that the activated B cell (ABC) cell-of-origin (COO) DLBCL subtype has a worse outcome compared to the germinal center B cell (GCB) subtype [10]. Because the original classification, based on transcriptome-level gene expression profiling, proved technically too challenging for implementation in clinical practice, surrogate immunohistochemistry-based algorithms were developed [11]. However, their utility was limited by suboptimal concordance with the gene expression-based gold standard. Recently, the Lymph2Cx assay, which measures expression of 20 genes and can also be applied to formalin-fixed paraffin-embedded (FFPE) material, was proposed for reproducible classification of DLBCL into COO subgroups [12]. It remains to be seen whether this new technology will be widely accepted in lymphoma centers worldwide. In addition to COO classification, immunohistochemical (IHC) studies have identified multiple protein markers, such as CD5, Ki-67, FOXP1, HLA-I, p21, and CD40, that prospectively showed prognostic value for R-CHOP-treated DLBCL [13–17].
During the last decade, substantial progress has been made toward understanding the genetic basis of DLBCL [18–20]. Shared and COO subtype-specific DNA lesions have been identified, converging on several frequently dysregulated cellular pathways [21]. Based on these findings, a handful of molecular prognostic markers have been identified, among which TP53, FOXP1, and MYD88 mutations and CDKN2A deletions were associated with inferior outcomes in R-CHOP-treated DLBCL [22–24]. The prognostic role of such molecular markers is likely to increase in the near future as high-throughput sequencing (HTS) enters routine practice in many institutions. However, more studies, particularly prospective analyses, are required to validate the existing molecular prognostic markers and to discover new ones that would add power to the existing prognostication algorithms.
In this study, we employed targeted HTS to identify somatic mutations in tumors of a well-documented prospective clinical cohort consisting of uniformly treated primary DLBCL patients. By correlating gene mutation status to the robust survival data, we aimed to discover new, and validate known, prognostic markers in DLBCL.
Methods
Study cohort
The clinical trial SAKK 38/07 (NCT00544219), active between 2007 and 2010, included 138 eligible patients with primary untreated DLBCL to prospectively determine the prognostic value of interim PET/CT scans using standardized treatment and evaluation criteria. The main clinical results have been published previously [25].
Tissue specimens from patients who consented to additional translational research were used for a subsequent study investigating the prognostic value of phenotypic and genotypic profiles by IHC and fluorescence in situ hybridization (FISH), the results of which have recently been published in this journal [13].
For the current study, FFPE tissues of 84 primary untreated de novo DLBCL patients with adequate amounts of remaining material were selected. Tumor content was determined by morphological evaluation and was at least 50% in all samples.
Updated clinical data (last follow-up on 31 January 2017) were used to evaluate the prognostic role of genetic mutations. All patients in the study cohort were uniformly treated with six cycles of R-CHOP, followed by two cycles of R (R-CHOP-14). The primary endpoint was event-free survival (EFS) at 2 years (for definition, see the “Statistical analysis” section), and the secondary endpoints were progression-free survival (PFS) and OS at 2 and 5 years as well as objective responses according to international criteria [26].
DNA extraction and quantification
Genomic DNA was extracted with the GeneRead DNA FFPE kit (Qiagen, Nussloch, Germany) following the manufacturer's recommendations with minor modifications. Briefly, one to three 10–25 μm thick tissue sections were deparaffinized by several xylene washes, rehydrated, and digested with proteinase K overnight at 56 °C in a shaking heat block. Following digestion, the samples were incubated for 1 h at 90 °C to reverse fixation-induced DNA crosslinks and inactivate proteinase K. Thereafter, uracil-N-glycosylase (UNG) was added to remove formalin-induced uracil residues and thereby reduce the number of false-positive C > T transitions. After incubation at 37 °C for 1 h, the samples were loaded onto a DNA purification column, washed, and eluted in 40 μl of nuclease-free water. DNA yields were quantified with the Qubit High sensitivity DNA assay (Life Technologies, Eugene, OR, USA).
Targeted HTS, variant calling, and filtering
A target enrichment panel was designed to cover mutational hotspots or all exons of the genes most frequently mutated in B cell lymphoid neoplasms according to the COSMIC database (release v70) and manual review of the literature [27]. Sequencing libraries were constructed exactly as described previously [27]. One microliter of the prepared library was used for the Bioanalyzer High Sensitivity DNA assay (Agilent, USA) to confirm the expected DNA fragment length distribution. Quantification was performed with the Ion Library Quantitation kit (Thermo Fisher Scientific, Carlsbad, CA, USA) following the original protocol. Libraries were diluted to 40 pM, loaded onto Ion540 sequencing chips by the automated IonChef instrument, and sequenced on the Ion Torrent S5 XL machine (Thermo Fisher Scientific, Carlsbad, CA, USA). The depth of coverage, coverage uniformity, and number of variants called per sample are summarized in Additional file 1: Table S1. Mutation identification was performed with the Variant caller plug-in v5.0 of the Torrent Suite (Thermo Fisher Scientific) using default low-stringency parameters for somatic mutation calling. Mutations were annotated using the Ion Reporter variant annotation workflow v5.0 and the dbNSFP v3.0 database [28]. The MetaLR rank score was used to predict the functional impact of non-synonymous point mutations on the encoded protein [29]. After annotation, the variants were subjected to additional, more stringent quality- and relevance-based filtering according to the criteria detailed in Table 1. All variants with variant allelic frequency >5% were used in downstream analysis.
Table 1
Criteria used for mutation filtering (variant inclusion)
General quality |
Phred-based quality | >50 |
Strand bias | ≤0.75 |
Number of reads supporting called variant | ≥10 |
Functional relevance |
Variant allelic frequency | ≥5% |
Localization | Exonic and splice site |
Variant effect | Non-synonymous |
SNP exclusion |
Variant allelic frequency | <95% |
Database annotation and alternative allelic frequency (1000 Genomes Project, samples of European descent) | Not listed in dbSNP v138, or listed but with MAF ≤0.01% |
Variants detected in the control cohort of 23 non-tumoral samples from lymphoma patients | Not overlapping |
Finally, aligned BAM files were manually inspected at the sites of all remaining variants to exclude false-positive mutations or other artifacts introduced during library preparation [30].
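The inclusion criteria of Table 1 can be read as a single predicate applied to each annotated variant. The following is a minimal sketch of that logic, not the pipeline actually used in the study: the `Variant` record, its field names, and the category strings are hypothetical, and only the numeric thresholds are taken from Table 1.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record for one annotated variant (field names are assumptions)."""
    quality: float            # Phred-based quality score
    strand_bias: float        # strand bias value reported by the caller
    alt_reads: int            # reads supporting the called variant
    total_reads: int          # total reads covering the position
    location: str             # e.g. "exonic", "splice_site", "intronic"
    effect: str               # e.g. "non-synonymous", "synonymous"
    in_dbsnp: bool            # listed in dbSNP
    maf_percent: float        # population minor allele frequency, in percent
    in_control_cohort: bool   # also seen in the non-tumoral control samples


def vaf_percent(v: Variant) -> float:
    """Variant allelic frequency in percent: supporting reads / total reads."""
    if v.total_reads <= 0:
        return 0.0
    return 100.0 * v.alt_reads / v.total_reads


def passes_table1(v: Variant) -> bool:
    """Apply the three groups of criteria from Table 1."""
    vaf = vaf_percent(v)
    general_quality = (
        v.quality > 50 and v.strand_bias <= 0.75 and v.alt_reads >= 10
    )
    functional_relevance = (
        vaf >= 5
        and v.location in {"exonic", "splice_site"}
        and v.effect == "non-synonymous"
    )
    snp_exclusion = (
        vaf < 95
        and (not v.in_dbsnp or v.maf_percent <= 0.01)
        and not v.in_control_cohort
    )
    return general_quality and functional_relevance and snp_exclusion


if __name__ == "__main__":
    example = Variant(60, 0.5, 30, 100, "exonic", "non-synonymous", False, 0.0, False)
    print(vaf_percent(example), passes_table1(example))
```

Expressing the three groups as separate boolean terms keeps the correspondence with the table rows visible; manual inspection of the BAM files remains a separate step that such a filter cannot replace.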
Statistical analysis
All statistical analyses were performed using the Statistical Package for the Social Sciences (IBM SPSS version 22.0, Chicago, IL, USA) for Windows. EFS was calculated from registration to progressive disease or relapse, death from any cause, or initiation of any non-protocol anti-cancer treatment because of lymphoma symptoms or the need for concomitant radiotherapy. PFS was calculated from registration to progressive disease or relapse, or death from any cause. OS was calculated from registration to death. Patients not experiencing an event were censored at last follow-up. Survival probabilities were determined using the Kaplan–Meier method, and groups were compared using the log-rank test. Factors of prognostic significance in univariable models underwent multivariable analysis using the Cox proportional hazards model. For other endpoints, differences between groups were tested with the t test, Wilcoxon rank-sum test, or Fisher’s exact test, as appropriate. Unless stated otherwise, p values are two-sided, were considered significant if <0.05, and were not corrected for multiple testing. For survival analysis within the COO subgroups, p values were corrected for multiple testing and were considered significant if <0.017.
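To make the survival methodology above concrete, the following is a minimal pure-Python sketch of the Kaplan–Meier estimator and a two-group log-rank test. It is an illustration only: the analyses in this study were run in SPSS, and the function names and toy follow-up values below are assumptions made for the example.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up durations (any consistent unit)
    events : 1 if the event was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, two-sided p value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the event count in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 degree of freedom
    return chi2, p_value


if __name__ == "__main__":
    # Toy data in months; not taken from the study cohort.
    print(kaplan_meier([6, 12, 12, 24, 30], [1, 1, 0, 1, 0]))
    print(logrank_test([1, 2], [1, 1], [3, 4], [1, 1]))
```

Censored patients contribute to the number at risk until their last follow-up but never to the event count, which is how the censoring rule stated above enters the estimate. The subgroup threshold of 0.017 equals 0.05/3 after rounding, which would be consistent with a Bonferroni-type adjustment for three comparisons; the text does not name the correction method, so this is only an observation.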
Discussion
There is a critical need for additional tumor-based prognostic markers in DLBCL to identify, prior to first treatment, high-risk patients who might benefit from alternative risk-adjusted therapies. Tumor mutations are promising candidates for outcome prognostication because mutation detection techniques yield unequivocal results and because HTS is increasingly available and robust. We have demonstrated that targeted mutational analysis of a relatively small but uniformly treated and prospectively followed patient cohort can reveal significant prognostic associations in DLBCL. This was only possible because the trial design included central collection of tissue for translational analysis as an integral part of the study, which is one lesson learned. Additional lessons learned from this prospective trial, such as the importance of a central diagnostic pathology review and the handling of biological entities and subentities within the spectrum of so-called high-grade B cell lymphomas, have been discussed in a paper of ours recently published in this journal [13].
We found that deleterious mutations in two acetyltransferase genes, CREBBP and EP300, which belong to the KAT3 family of histone/protein lysine acetyltransferases, predict worse OS, PFS, and EFS in DLBCL independent of IPI and FACR. Point mutations or deletions of CREBBP/EP300 reportedly affect 39% of all DLBCL cases [31]. In line with other studies, we confirm that CREBBP mutations are more frequent than EP300 mutations and tend to occur more often in GCB–DLBCL [18]. In the context of tumorigenesis, CREBBP and EP300 act as tumor suppressors. It has been shown that loss of one CREBBP allele leads to reduced acetylation and inactivation of p53, impaired expression of glucocorticoid-receptor-responsive genes, and upregulation of BCL6 [31, 32]. Finally, recent data suggest that heterozygous deleterious CREBBP mutations lead to decreased global histone H3 lysine 14 (H3K14), K18, and K27 acetylation and reduced MHC class II expression (Hashwah et al., currently under peer review). An association with poorer outcomes was suggested by studies that found CREBBP mutations in 20% of relapsed/refractory GCB–DLBCL [33] and in a large proportion of relapsed acute lymphoblastic leukemia patients [31, 34]. Despite this important clue, however, the prognostic value of CREBBP/EP300 mutations in DLBCL has not been previously reported.
SOCS1 mutations predicted excellent PFS in our cohort, but differences in OS and EFS were not significant. Suppressor of cytokine signaling 1 (SOCS1) is a known inhibitor of JAK/STAT-dependent signal transduction that binds to phosphorylated JAK and marks it for proteasomal degradation [35, 36]. Mutations of SOCS1 have previously been shown to be associated with favorable survival in DLBCL [37] and are also frequently detected in Hodgkin lymphoma and primary mediastinal B cell lymphoma (PMBCL), both of which have a relatively favorable prognosis [38]. We also previously showed that SOCS1 mutations occurred exclusively in non-relapsing primary DLBCL and were completely absent in relapsing DLBCL cases, supporting the association with favorable prognosis [27]. Schif et al. reported that DLBCL bearing truncating SOCS1 mutations have excellent OS, whereas those with only missense mutations have markedly worse prognosis [37]. Our data, however, do not confirm such a distinction: 6 of 19 SOCS1-mutated cases had only missense mutations, but their PFS was as favorable as that of cases with truncating mutations. The SOCS1 mutational frequency in our cohort (28%) was higher than the reported average (~13%) [39]. All detected mutations had relatively high variant allelic frequencies (median 28%, range 6–78%); therefore, our higher mutation rate cannot be explained by comparatively high sequencing depth and sensitive detection of subclonal mutations that could have been missed by exome-scale sequencing studies. A potential explanation might be a bias of our cohort toward better-than-average survival, e.g., due to the study protocol's exclusion of patients with performance status >2 on the ECOG scale, with symptomatic central nervous system disease, or with HIV and/or hepatitis infection; the cohort may thus be enriched for cases with better prognosis, which consequently more often bear SOCS1 mutations.
The prognostic value of multiple other gene mutations, such as TP53, MYD88, FOXP1, and FOXP2, in DLBCL has been suggested previously [22, 23, 40, 41]. In our cohort, TP53-mutated cases had worse OS, but this difference did not reach statistical significance. Also, the low number (n = 5) of MYD88 L265P-mutated cases did not allow detection of any reliable prognostic associations. Similarly, although ATM mutations were consistently associated with worse outcomes in GCB–DLBCL in our study, this observation was based on a small number of events and mutated cases and therefore remains to be validated in other, larger cohorts. While we were unable to identify mutations of FOXP1, we previously reported that overexpression of the FOXP1 protein is associated with worse OS in the investigated DLBCL cohort [13]. Here, we show that the combination of HTS- and IHC-based prognostic markers (CREBBP/EP300 mutations and FOXP1 overexpression) enables even better stratification between patients with good and worse prognosis.
It is unlikely that any single biomarker will significantly improve risk stratification in DLBCL, given the profound heterogeneity of this disease. This point is illustrated by 3 cases in our cohort with mutations of both SOCS1 and CREBBP (Fig. 2, cases UPN57, UPN112, and UPN115) that clinically behaved like CREBBP-mutated cases and had worse prognosis. It is more likely that a combination of several different types of markers, established on uniformly treated prospective cohorts and selected by robust statistical methods, would provide models that could be prospectively validated on larger DLBCL collections. A good example of such an updated composite prognostic model is the m7-FLIPI, a recently developed score for follicular lymphoma that incorporates mutations and clinical factors and provides superior prognostication compared to the traditionally used, clinical factor-based FLIPI [42].
Acknowledgements
We are grateful to Prof. Anne Müller for her insights on the functional relevance of CREBBP/EP300 mutations and fruitful scientific discussions. Also, special thanks to Valeria Perrina and Sibylle Tschumi for their help in performing high-throughput sequencing.