Introduction
Chimeric antigen receptor (CAR)-engineered T-cells have become an important cancer immunotherapy, and their clinical application for the treatment of hematological cancers is increasing rapidly [1–3]. For most CAR T-cell therapies, autologous T-cells are genetically modified to express a specific chimeric antigen receptor, which has high specificity and affinity for an antigen expressed on the surface of target cells [1–3]. Currently, the clinical production of CAR T-cells relies to a great extent on T-cell transduction using viral vectors, mainly γ-retroviral and lentiviral vectors, to deliver the engineered receptors of interest. Both γ-retroviruses and lentiviruses are members of the Retroviridae family, which is characterized by the ability to reverse-transcribe the RNA genome into a cDNA copy and stably integrate it into the host cell genome [4, 5]. However, most of these integration events occur randomly. In some cases of CAR T-cell therapy, vector integration has resulted in a growth advantage that led to clonal CAR T-cell expansion and dominance [6, 7]. CAR T-cells with viral vector integration events in the exons of TET2 and CBL have resulted in clonal expansion and complete disease remission [6, 7]. While CAR T-cell therapy has proven to be relatively safe regarding long-term and unanticipated impacts of vector integration [8, 9], serious adverse effects associated with vector integration have been observed in human gene therapies targeting immune deficiencies. In clinical trials for X-linked Severe Combined Immunodeficiency (SCID-X), for instance, a therapeutic retroviral vector integrated near the LMO-2 proto-oncogene locus and caused leukemia-like illness [10, 11]. While such leukemia-like illnesses have not occurred with clinical CAR T-cell therapy using standard transduction methodologies, the potential effects of these insertion sites on CAR T-cell safety and potency are still unclear. Thus, it is essential to comprehensively map the precise viral vector integration sites and evaluate their potential effects on CAR T-cell safety and potency.
Next-generation sequencing and various cutting-edge biomolecular techniques have made it possible to identify viral vector integration sites across the entire genome. Several methods have been developed to investigate viral insertion events [12–18]. These studies have reported that integration may occur in different chromosomes and regions of the human genome. Furthermore, vector integration preferentially occurs at fragile sites, transcriptionally active regions and regions recurrently involved in translocation events [19–23]. However, most of these studies have been limited to cells infected with human immunodeficiency virus (HIV), human papillomavirus (HPV) or murine leukemia virus (MLV), and little is known about the characteristics of viral integration events in CAR T-cells. In addition, the effects of viral integration events on the transcriptome of CAR T-cells and their association with clinical outcomes are not known.
Here, we explored the insertional sites of γ-retroviral and lentiviral vectors in clinical CAR T-cell products using a vector integration site analysis (VISA) pipeline, which was modified from previously reported workflows [12–15, 17, 18]. Combined with RNA-seq data and clinical outcomes, we further explored the insertional effects on gene expression and their association with clinical outcomes.
Materials and methods
Collection of CAR T-cell products
A total of 75 CAR T-cell and 6 TCR T-cell products were analyzed. Fifty-seven products were manufactured with lentiviral vectors: CD22-CAR T-cells (n = 41) [3], CD19/CD22 CAR T-cells (n = 13), CD30-CAR T-cells (n = 2) [24] and FGFR4-CAR T-cells (n = 1). Twenty-four products were manufactured with γ-retroviral vectors: BCMA-CAR T-cells (n = 11) [25], SLAMF7-CAR T-cells (n = 6) [26], CD19-CAR T-cells (n = 1) and E7-TCR T-cells (n = 6) [27]. All products (paired with non-transduced cells, which were cultured in the same manner) were sampled before they were infused into the patient. A cell pellet from each CAR T-cell product was collected and saved for DNA and RNA extraction. All subjects provided written consent.
DNA extraction and vector integration site analysis
Genomic DNA was extracted from pre-infusion CAR T-cell products using the Qiagen DNeasy Blood and Tissue Kit (Cat#69506, Qiagen). Purity and concentration were measured using a Nanodrop spectrophotometer (Thermo Fisher Scientific). A total of 2.5 µg gDNA was used to construct pools of adaptor-ligated hDNA-vDNA fragment libraries according to the manual of the Retro-X/Lenti-X Integration Site Analysis Kit (Cat#631467, Cat#631263, Takara). Briefly, genomic DNA from CAR T-cell products was fragmented with the restriction enzyme DraI, end-repaired and ligated with GenomeWalker adaptors (supplied by Takara). hDNA-vDNA fragments were then amplified from the digested, adaptor-ligated DNA by nested PCR using primers specific to the viral LTR and the adaptors. The hDNA-vDNA fragments generated with this method ranged from 100 to 2,000 bp in length and contained the proviral long terminal repeat (LTR), the flanking genomic DNA and a linker adaptor. Highly purified hDNA-vDNA fragments were collected from the nested PCR products using a PCR clean-up kit (Cat#740609.250, Takara). A total of 100 ng of highly purified DNA was used to prepare dual-indexed paired-end sequencing libraries according to the Nextera™ DNA Flex Library Prep workflow (Illumina). Sixteen libraries were pooled and sequenced on a MiSeq platform using a MiSeq Reagent Kit v3 (600-cycle, Cat#MS-102-3003, Illumina). As a result of this procedure, the sequencing reads contain not only the genomic fragment needed for integration site (IS) identification, but also viral and barcode sequences, which were trimmed (FastQC and Trim Galore) before alignment to the reference genome (Bowtie2 and SAMtools). Finally, aligned sequencing reads were annotated (deepTools, IGV and SeqMonk) to yield the final list of annotated viral integration sites.
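The step from aligned junction reads to a list of integration sites can be illustrated with a minimal sketch. It is not the VISA pipeline itself: the function names, the input format (one tuple of chromosome, position and strand per aligned read) and the 5 bp merging window are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def _summarize(chrom, strand, cluster):
    """Report the most frequent position in a cluster (ties: smallest position)."""
    counts = Counter(cluster)
    best = min(counts, key=lambda p: (-counts[p], p))
    return {"chrom": chrom, "strand": strand, "position": best, "reads": len(cluster)}


def call_integration_sites(reads, window=5):
    """Collapse aligned junction reads into candidate integration sites.

    reads  : iterable of (chrom, position, strand) tuples, one per aligned read
             (hypothetical input format, e.g. parsed from a BAM file upstream).
    window : assumed tolerance in bp; consecutive positions on the same
             chromosome and strand that are at most this far apart are merged.
    """
    by_key = defaultdict(list)
    for chrom, pos, strand in reads:
        by_key[(chrom, strand)].append(pos)

    sites = []
    for (chrom, strand), positions in sorted(by_key.items()):
        positions.sort()
        cluster = [positions[0]]
        for pos in positions[1:]:
            if pos - cluster[-1] <= window:
                cluster.append(pos)  # same candidate site
            else:
                sites.append(_summarize(chrom, strand, cluster))
                cluster = [pos]  # start a new candidate site
        sites.append(_summarize(chrom, strand, cluster))
    return sites
```

The read count per site is only a rough indicator of abundance here, because nested PCR amplification can distort read numbers.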
Calculation of vector copy number
A Bio-Rad Laboratories AutoDG QX200™ ddPCR system was used to determine the vector copy number of CAR T-cell products (bulk cells, not CAR+ T cells) and to verify integration sites in the host genome. The detailed protocol was described in our previous publication [28].
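For readers unfamiliar with ddPCR quantification, the general calculation can be sketched as follows. This is the standard Poisson correction for droplet counts, not the protocol of reference [28]; the function names and the assumption of a reference gene present at two copies per diploid cell are illustrative only.

```python
import math


def copies_per_droplet(positive, total):
    """Mean target copies per droplet from droplet counts (Poisson correction)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1 - positive / total)


def vector_copy_number(vector_positive, reference_positive, total,
                       reference_copies_per_cell=2):
    """Vector copies per cell, normalized to a reference gene.

    Assumes both targets were measured in the same droplets and that the
    reference gene is present at `reference_copies_per_cell` copies per cell
    (two for a diploid autosomal locus; this is an assumption of the sketch).
    """
    reference = copies_per_droplet(reference_positive, total)
    if reference == 0:
        raise ValueError("no reference-positive droplets")
    vector = copies_per_droplet(vector_positive, total)
    return reference_copies_per_cell * vector / reference
```

Because the measurement is made on bulk cells, the result is an average over transduced and non-transduced cells.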
Transcriptome library preparation and sequencing
Total RNA was isolated from CAR T-cell products (transduced and non-transduced) using the miRNeasy Mini Kit (Cat#217084, Qiagen). Concentration and quality were measured with a Nanodrop 8000 (Thermo Fisher Scientific) and a 2100 Bioanalyzer (Agilent). Sequencing libraries were prepared using the TruSeq Stranded Total RNA kit (Cat#20020598, Illumina) according to the manufacturer's protocol and sequenced on an Illumina NextSeq 550 platform.
Transcriptome data analysis
Raw FASTQ files were checked with FastQC for quality control and processed with Trimmomatic to remove adapter sequences and low-quality reads. Filtered reads were aligned against the human reference genome (GENCODE hg38) using the STAR aligner. Gene expression levels were quantified using featureCounts from the Subread package. Differential expression analysis was performed using the limma package in RStudio with custom scripts. Wald's test was used to assess the significance of differential expression and to obtain adjusted p-values. Genes with |fold change| ≥ 2 and adjusted p-value < 0.05 were considered significantly differentially expressed. Differential alternative splicing events were quantified with rMATS using the BAM files produced by STAR. Splicing events with FDR < 0.05 and ΔPSI ≥ 0.2 were considered statistically significant. Student's t-test was used to calculate the p-value when comparing the numbers of differential alternative splicing transcripts in lentiviral products and γ-retroviral products; p ≥ 0.05 was considered not statistically significant. The rmats2sashimiplot program was used to produce sashimi plot visualizations of the rMATS output.
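The significance thresholds above can be written as a short filter. This is a sketch, not the authors' custom scripts: the function names are hypothetical, the fold change is assumed to be a linear ratio (so that a two-fold change in either direction passes), and ΔPSI is assumed to be compared as an absolute value.

```python
import math


def is_differentially_expressed(fold_change, adj_p, fc_cutoff=2.0, alpha=0.05):
    """Apply the |fold change| >= 2 and adjusted p < 0.05 criteria.

    fold_change is assumed to be a positive linear ratio (transduced versus
    non-transduced), so the cutoff is applied symmetrically on the log2 scale.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    return abs(math.log2(fold_change)) >= math.log2(fc_cutoff) and adj_p < alpha


def is_differential_splicing(delta_psi, fdr, psi_cutoff=0.2, alpha=0.05):
    """Apply the FDR < 0.05 and delta-PSI >= 0.2 criteria (absolute value assumed)."""
    return abs(delta_psi) >= psi_cutoff and fdr < alpha


# Hypothetical records: (gene, fold change, adjusted p-value)
genes = [("GENE_A", 2.5, 0.01), ("GENE_B", 0.4, 0.03), ("GENE_C", 1.5, 0.001)]
significant = [g for g, fc, p in genes if is_differentially_expressed(fc, p)]
```

Working on the log2 scale treats up-regulation and down-regulation symmetrically, which a cutoff on the raw ratio would not.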
Gene ontology analysis
Gene ontology (GO) analysis was performed on the dataset using the R package clusterProfiler (version 3.0.4) [29]. GO analysis differs from GSEA in that it uses a different annotation set and accounts for gene length bias when detecting over- or under-representation of genes. With the hg37 annotation set, we performed enrichment analysis on our set of differentially enriched genes. We used log2(FC) and the p-value to determine significant genes for this analysis. We then determined which GO terms were over- or under-represented and visualized the data. We grouped the GO terms by biological process (BP) and selected the significant, over-represented terms. The adjusted p-value was calculated with the built-in function of the clusterProfiler package in RStudio.
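The idea behind an over-representation test for a single GO term can be sketched with a hypergeometric tail probability. This is a generic illustration and not the clusterProfiler implementation; the function name and arguments are hypothetical, and the sketch applies neither multiple-testing adjustment nor any correction for gene length.

```python
from math import comb


def overrepresentation_p_value(hits, selected, annotated, background):
    """P(X >= hits) for X ~ Hypergeometric(background, annotated, selected).

    hits       : selected genes that carry the GO term
    selected   : number of selected (e.g. differentially expressed) genes
    annotated  : background genes that carry the GO term
    background : total number of background genes
    """
    if not (0 <= hits <= selected <= background and 0 <= annotated <= background):
        raise ValueError("inconsistent counts")
    upper = min(selected, annotated)
    # Count the ways to draw `selected` genes with at least `hits` annotated ones.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / comb(background, selected)
```

In practice the resulting p-values for all tested terms would still need adjustment for multiple testing, which the paragraph above delegates to the package's built-in function.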
Principal component analysis
Principal component analysis (PCA) was conducted using the factoextra package in the R environment (version 3.6.1). Custom code was uploaded to a public GitHub repository, as described in our previous publication [30].
Statistical analysis
All statistical analyses were performed with GraphPad Prism software and related R packages. A p-value less than 0.05 was considered significant. The use of other statistical tests is indicated in each figure legend.
Discussion
Though retroviral vectors are widely used in manufacturing CAR T-cell products for immunotherapy, little is known about the integration sites of γ-retroviral and lentiviral vectors in CAR T-cells. In this study, we modified a reproducible pipeline to systematically monitor CAR T-cell viral vector integration sites, investigate insertional effects on host gene expression and explore potential associations between integration events and clinical outcomes.
We found that lentiviral and γ-retroviral vectors each have their own specific integration patterns and integration hotspot loci. γ-retroviral vectors were more likely than lentiviral vectors to insert into promoter, UTR, and exon regions, while lentiviral vector integration sites were more likely than γ-retroviral ones to occur in intron and intergenic regions.
We found that integration events affected gene expression at the transcriptional and post-transcriptional level. These integration sites could be affected by the characteristics of γ-retroviral CAR T-cell products, such as the proportion of CD4+ and CD8+ T-cells. More importantly, some genes were identified in CD22 CAR T-cell products that showed differential viral vector integration based on clinical outcomes.
Factors used as predictive markers of clinical outcomes of CAR T-cell immunotherapy have been reported in recent years. The nature of T-cell subsets and the expression of immune checkpoints by T-cells before CAR T-cell manufacturing have been described as influencing the efficiency of CAR T-cell products [38–41]. CAR T-cell expansion and persistence have also been described as two potential markers of clinical response [42, 43]. We found that clinical outcomes were associated with differential vector integration events in CAR T-cells. Interestingly, in non-responders, more integration events were found in genes mainly involved in neutrophil activation, which may mediate immune suppression activity [44, 45]. These results suggest that further study of differential integration events could reveal additional useful predictive markers for clinical outcomes, recognizing the limitation, however, that these analyses are rarely done in real time and are often performed after patients are treated.
Preferential integration has been previously reported in HIV- and MLV-infected cell lines; these studies found that lentiviral vectors integrate primarily within the bodies of actively transcribed genes, whereas γ-retroviral vectors preferentially target active promoters [46, 47]. Given that viral vectors used in CAR T-cell manufacturing are genetically engineered and differ in some elements from the original viral sequence, it is essential to explore vector integration patterns in CAR T-cell products. We found some differences in the distribution of integration sites across genomic structures in CAR T-cell products. Our results showed that lentiviral vectors mainly inserted into non-coding regions (intergenic and intron), whereas γ-retroviral vectors had a higher integration percentage in promoter, exon and untranslated regions than lentiviral vectors.
To date, whether viral integration occurs randomly is still a matter of debate. The general view is that integration is semi-random, meaning that viral integration is not random with respect to genomic features but is random at the level of individual gene loci [48, 49]. We found that some gene loci are hotspots for viral integration in CAR T-cells, which indicates that these were non-random integration events. Interestingly, these hotspot genes differed significantly between lentiviral and γ-retroviral vectors. None of the previously published studies reported hotspots of viral integration at gene loci. Our data provide evidence that viral integration at gene loci is only semi-random. As to why these gene loci were selected as hotspots, the main factors influencing integration are the sequence of the viral vector and genomic factors such as gene accessibility, GC content, and epigenetic modification.
We found that lentiviral and γ-retroviral vectors had their own distinct integration patterns and hotspot gene loci. These different patterns were also reflected in transcriptional differences. CAR T-cell products made with γ-retroviral vectors showed more differentially expressed genes (DEGs) and had more viral integration events at promoter and untranslated regions. Among these DEGs, only 1% had integration events in their promoter and UTR regions in CAR T-cell products manufactured with γ-retroviral vectors, while no integration events were found in these regions in CAR T-cell products manufactured with lentiviral vectors. Furthermore, the differentially expressed genes in CAR T-cells manufactured with γ-retroviral vectors included both up-regulated and down-regulated genes (Fig. S3D and E). We speculate that the nature of the changes in gene expression depends on the relative orientation of the integrated vector and the gene promoter. Gene expression was enhanced when the vector was in the same orientation as the cellular gene and reduced when it was in the opposite orientation. In addition, we showed that integration events could affect mRNA transcripts at the post-transcriptional level by impairing alternative splicing when they altered exons and introns. Given the small percentage of DEGs with viral integration events in their gene loci, our findings indicate that most integration events have little direct impact on gene expression. Most DEGs without integration in their gene loci could be the result of indirect regulation by integration events, which may insert into regulatory elements and impair their role in regulating gene expression [11, 50].
Our study was limited in that we were unable to track the CAR T-cells post-infusion to determine whether the cells expanded clonally, because patient samples were not available after CAR T-cell infusion. A recent study described the dominance of a single infused CAR T-cell clone in a single patient that was associated with integration into the TET2 gene [6]. Another study found that integration into the CBL gene generated detectable expansion of CAR T-cell clones [7]. Future research could study the long-term impacts of viral integration events.
In summary, we comprehensively explored viral vector integration sites in pre-infusion CAR T-cell products using a modified VISA pipeline and verified the semi-random nature of viral integration into the genome. We found that individuals with different clinical outcomes showed a group of genes with differential enrichment of integration events, and the function of these genes may be disrupted through interruption of the amino acid sequence and generation of abnormal proteins, rather than through altered mRNA expression. Whether viral vector integration sites correlate with clinical outcomes warrants further study. Most importantly, we found that integration patterns, insertion hotspots and effects on gene expression differ between the lentiviral and γ-retroviral vectors used in CAR T-cell products, and we established a foundation upon which further analyses can be conducted.