Background
Advances in sequencing techniques have enabled broad genomic analyses of bladder cancer cohorts and molecular subtyping. Subtyping of muscle-invasive urothelial bladder cancer (MIBC) groups heterogeneous cancers into categories with similar molecular and biological characteristics, which has significantly contributed to our knowledge in recent years [1]. Several groups have simultaneously worked on molecular subtyping of different bladder cancer datasets, arriving at a description of two main types (luminal and basal) that can be further subclassified into 3–10 subtypes [2–8]. Different nomenclatures, definitions, and numbers of molecular subtypes hindered further prospective validation and clinical translation until a consensus classification was described. The molecular consensus classification used pooled mRNA expression profiles of 1750 fresh frozen and formalin-fixed, paraffin-embedded (FFPE) MIBC samples and identified six molecular classes: luminal papillary (LumP), luminal nonspecified (LumNS), luminal unstable (LumU), stroma-rich, basal/squamous (Ba/Sq), and neuroendocrine-like (NE-like) [3].
The aim of classifying cancer into subgroups is to identify tumors that share a similar prognosis and response to various therapies. In several reports, molecular subtypes have been described as predictors of response to chemotherapy and immunotherapy [5, 9–12]. However, the results are conflicting, and to date the evidence is insufficient to use molecular subtyping or other gene expression signatures for treatment decisions in patients with urothelial carcinoma. To facilitate the implementation of molecular subtyping in daily clinical routine, gene sets have been reduced to allow quantification with RT-qPCR, NanoString, or immunohistochemistry panels [13–20].
Many molecular profiling studies, including molecular subtyping in The Cancer Genome Atlas (TCGA), are based on fresh frozen tissue, which allows high-quality transcriptomic analyses based on RNA sequencing. However, fresh frozen samples are rarely available in clinical practice and for retrospective research projects. FFPE tissue is the gold standard for pathological analyses and long-term storage in hospitals. The paraffin material is usually archived for 10 or more years, allowing correlation with long-term patient outcome. The disadvantage of FFPE tissue is that the RNA is highly degraded by fixation and storage, leading to sequencing artifacts and limiting the detection of transcripts [21, 22]. However, advances in sequencing techniques now also enable molecular profiling of FFPE specimens [23–25].
In this study, we tested and compared the feasibility of two RNA sequencing methods with FFPE tissues from MIBCs to determine uniform molecular subtyping. In addition, we used two reduced predefined gene sets to determine molecular subtypes and compared results with the comprehensive transcriptome data.
Methods
Patients and samples
Tumor samples were provided by the University Cancer Center Frankfurt (UCT). Written informed consent was obtained from all patients. The study was approved by the institutional review boards of the UCT and the ethical committee at the University Hospital Frankfurt (project-number: SUG-6–2018, UCT-53–2021), and conducted according to local and national regulations and to the Declaration of Helsinki. Fifteen FFPE samples from MIBC patients were obtained from the Dr. Senckenberg Institute of Pathology.
Immunohistochemistry (IHC)
Samples were stained for CK5/6 (Clone: D5/16 B4; Dako/Agilent, Santa Clara, CA, USA) and GATA3 (Clone: L50-823; Cell Marque, Rocklin, CA, USA) as described before [19].
RNA-isolation
For each RNA isolation, a 1-mm punch was taken from FFPE blocks of an annotated, representative tumor area with at least 50% tumor content. RNA was either extracted using the truXTRAC FFPE total NA Kit (Covaris, Woburn, MA, USA) or by GenXPro GmbH.
HTG transcriptome panel (HTP)
The mRNA expression was determined using the HTP (HTG Molecular Diagnostics, Tucson, AZ, USA) as described before [19, 26]. Briefly, target capture was done by the HTG EdgeSeq chemistry with nuclease protection probes on a 96-well plate. Processed samples were used to set up PCR reactions with specially designed primers, referred to as “tags”. These tags share common sequences that are complementary to the 5’-end and 3’-end “wing” sequences of the probes, as well as common adaptors required for cluster generation on an Illumina sequencing platform. In addition, each tag contains a unique barcode that is used for sample identification and multiplexing. The library was prepared by PCR with OneTaq (New England Biolabs, Ipswich, MA, USA) and EdgeSeq PCR tag primers (HTG Molecular Diagnostics). Sequencing was performed on the Illumina NextSeq 550 system (Illumina, San Diego, CA, USA) in accordance with the manufacturer’s recommendations, with the addition of two HTG custom sequencing primers. The sequencing data on mRNA expression of target genes were imported into the HTG EdgeSeq Parser software for alignment of the FASTQ files to the probe list and quantification of the reads. The HTG EdgeSeq Reveal Application was used for quality checks and data normalization. Gene counts were normalized using counts per million (CPM) and median normalization and were log2-transformed for further analysis.
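The CPM scaling and log2 transformation mentioned above can be illustrated with a minimal sketch. This is not the HTG EdgeSeq Reveal implementation: the pseudocount of 1, the function names, and the toy counts are assumptions chosen for illustration, and the additional median normalization step is not reproduced here.

```python
import math

def cpm(counts):
    """Scale the raw read counts of one sample to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: n / total * 1_000_000 for gene, n in counts.items()}

def log2_transform(values, pseudocount=1.0):
    """log2-transform values; the pseudocount (an assumption) avoids log2(0)."""
    return {gene: math.log2(v + pseudocount) for gene, v in values.items()}

# Hypothetical raw counts for a single sample (illustrative numbers only).
sample = {"KRT5": 150, "GATA3": 50, "ACTB": 800}
sample_cpm = cpm(sample)
sample_log2 = log2_transform(sample_cpm)
```

Because CPM divides by the library size of each sample, values become comparable between samples sequenced at different depths; the log2 step then compresses the dynamic range before correlation-based classification.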
MACE Seq
Massive Analysis of cDNA Ends (MACE) is a 3’ mRNA sequencing method based on the analysis of Illumina reads derived from fragments that originate from 3’ mRNA ends [24, 27]. Samples were prepared by GenXPro GmbH (Frankfurt, Germany) using the MACE-Kit V2 according to the manufacturer’s manual (GenXPro GmbH). RNA was fragmented, and polyadenylated mRNA was enriched by poly-A specific reverse transcription. A specific adapter was integrated at the 5’ ends, and the products were amplified by competitive PCR and sequenced on an Illumina NextSeq 500 instrument. Duplicate reads, as determined by the implemented unique molecular identifiers (TrueQuant IDs), were removed from the raw dataset. Low-quality bases were removed with the software Cutadapt (https://github.com/marcelm/cutadapt/), and poly(A) tails were clipped by an in-house Python script. The reads were mapped to the human genome (hg38), and transcripts were quantified by HTSeq [28].
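Two of the read-cleaning steps described above, removal of duplicates by unique molecular identifier and clipping of poly(A) tails, can be sketched as follows. The in-house script is not published, so everything here is an assumption for illustration: the function names, the minimum tail length of 10 bases, and the definition of a duplicate as an identical (identifier, sequence) pair.

```python
import re

def deduplicate(reads):
    """Keep the first read of each (UMI, sequence) pair.

    reads is an iterable of (umi, sequence) tuples. Treating an identical
    pair as a PCR duplicate is an assumption made for this sketch.
    """
    seen = set()
    unique = []
    for umi, seq in reads:
        if (umi, seq) not in seen:
            seen.add((umi, seq))
            unique.append((umi, seq))
    return unique

def clip_poly_a(seq, min_len=10):
    """Remove a trailing run of at least min_len A bases (threshold assumed)."""
    return re.sub(r"A{%d,}$" % min_len, "", seq)

# Hypothetical reads: the first two share UMI and sequence, the third does not.
reads = [
    ("ACGT", "GATTACAGGC" + "A" * 12),
    ("ACGT", "GATTACAGGC" + "A" * 12),
    ("TTGA", "GATTACAGGC" + "A" * 12),
]
cleaned = [(umi, clip_poly_a(seq)) for umi, seq in deduplicate(reads)]
```

Reads that share a sequence but carry different identifiers are kept, since they are taken to stem from distinct mRNA molecules rather than from PCR amplification.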
Molecular subtyping
Molecular consensus classes of MIBC were assigned using the nearest-centroid transcriptomic classifier of the consensusMIBC package for R (https://github.com/cit-bioinfo/consensusMIBC); TCGA classes were assigned using the BLCAsubtyping package (https://github.com/cit-bioinfo/BLCAsubtyping) as described by Kamoun et al. [3]. The minimal threshold for the best Pearson’s correlation was set to 0.2. Normalized and log2-transformed gene expression values were used. Retrieved data include the consensus class, the Pearson’s correlation coefficient between each sample and each consensus class, the p-value associated with the Pearson’s correlation of the sample with the nearest centroid (correlation p-value), and the separation level.
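The principle of a nearest-centroid classifier with a minimal correlation threshold can be sketched in a few lines. This is an illustrative Python re-implementation of the idea, not the consensusMIBC code: the centroid values and the four-gene toy signature are hypothetical, and the correlation p-value and separation level are not computed.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / (sx * sy)

def classify(sample, centroids, min_cor=0.2):
    """Assign the class whose centroid correlates best with the sample.

    sample maps gene -> log2 expression; centroids maps class -> gene -> value.
    Returns (call, correlations); call is None if no correlation reaches min_cor.
    """
    cors = {}
    for label, centroid in centroids.items():
        genes = sorted(set(sample) & set(centroid))
        cors[label] = pearson([sample[g] for g in genes],
                              [centroid[g] for g in genes])
    best = max(cors, key=cors.get)
    return (best if cors[best] >= min_cor else None), cors

# Hypothetical centroids for two classes (values are illustrative only).
centroids = {
    "LumP": {"GATA3": 2.0, "KRT5": -1.5, "PPARG": 1.0, "KRT14": -1.0},
    "Ba/Sq": {"GATA3": -2.0, "KRT5": 2.0, "PPARG": -1.0, "KRT14": 1.5},
}
sample = {"GATA3": 1.0, "KRT5": 8.5, "PPARG": 2.0, "KRT14": 7.0}
call, cors = classify(sample, centroids)
```

In this toy example the sample shows high keratin and low luminal marker expression, so it correlates positively with the basal/squamous centroid and negatively with the luminal papillary one. A sample whose best correlation stays below the threshold remains unclassified.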
For comparison, we used the example data set of the TCGA bladder cancer cohort included in the package, which was generated from fresh tumor specimens [2].
We reduced the gene set input according to two proposed panels for bladder cancer subtyping (Tables S1–S2) [16, 20]. The “ESSEN1” panel is a 68-gene set covering tumor and stromal signatures [16]. This panel was further optimized and condensed to a set of 48 genes (called “ESSEN2” in the present study) [20].
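Reducing the classifier input to a panel and comparing the resulting calls with those from the full transcriptome can be sketched as below. The helper names, the two-gene panel, the sample identifiers, and the subtype calls are hypothetical and serve only to show the calculation; they are not taken from the study data or from the ESSEN panels.

```python
def restrict_to_panel(expression, panel_genes):
    """Keep only the genes of one sample that belong to the panel."""
    return {g: v for g, v in expression.items() if g in panel_genes}

def agreement(calls_full, calls_panel):
    """Fraction of shared samples that receive the same subtype call."""
    shared = [s for s in calls_full if s in calls_panel]
    if not shared:
        raise ValueError("no samples in common")
    matches = sum(calls_full[s] == calls_panel[s] for s in shared)
    return matches / len(shared)

# Hypothetical expression profile and panel (illustrative only).
profile = {"GATA3": 9.1, "KRT5": 3.2, "ACTB": 12.4}
panel = {"GATA3", "KRT5"}
reduced = restrict_to_panel(profile, panel)

# Hypothetical subtype calls from the full transcriptome and from a panel.
calls_full = {"S1": "LumP", "S2": "Ba/Sq", "S3": "Stroma-rich", "S4": "LumU"}
calls_panel = {"S1": "LumP", "S2": "Ba/Sq", "S3": "Ba/Sq", "S4": "LumU"}
concordance = agreement(calls_full, calls_panel)
```

Here three of four calls match, giving a concordance of 0.75. The reduced profile would be passed to the same classifier as the full profile, so any difference in calls reflects the gene selection rather than a change of method.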
The heatmap was constructed with the open-source Morpheus software (https://software.broadinstitute.org/morpheus/) using log2-transformed normalized gene expression values. For hierarchical clustering, we used Pearson’s correlation metric with complete linkage.
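Hierarchical clustering with a Pearson-based metric and complete linkage can be illustrated with a small pure-Python sketch. It is independent of Morpheus, which performs the clustering internally; the distance definition of one minus Pearson’s correlation, the stopping rule at a fixed number of clusters, and the toy profiles are assumptions for this example.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and complete linkage."""
    names = list(profiles)
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def cluster_distance(c1, c2):
        # Complete linkage: the largest pairwise distance between members.
        return max(dist[(a, b)] for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: cluster_distance(clusters[p[0]],
                                                         clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical log2 expression profiles over four genes (illustrative only).
profiles = {
    "S1": [8, 7, 1, 2],
    "S2": [9, 8, 2, 1],
    "S3": [1, 2, 9, 8],
    "S4": [2, 1, 8, 9],
}
clusters = complete_linkage(profiles, n_clusters=2)
```

With these toy profiles, S1 and S2 form one cluster and S3 and S4 the other, because the profiles within each pair are strongly correlated while the pairs are anticorrelated with each other.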
Discussion
Molecular analyses of bladder cancer specimens are emerging and provide fundamental information to address bladder cancer research questions. One of the major goals of bladder cancer research is to identify subtypes with different sensitivity to systemic therapies such as chemotherapy, immune-checkpoint inhibition, and further targeted treatments. Although molecular subtype classification of bladder cancer has not yet been incorporated into therapeutic decision making, robust methods are important to achieve progress in clinical translation and validation and to improve reproducibility. FFPE samples represent snapshots of the histology and biological information of a tumor at the time of collection, while the patient is being treated and outcome data can be generated. It is therefore important to use this information, which is routinely collected and stored as FFPE tissue, and to overcome the limitations of degraded RNA.
In this study, we used RNA isolated from FFPE tissue to determine molecular consensus subtypes using two different sequencing techniques. The overall agreement between molecular subtypes was high for both techniques, although different RNA isolates, library preparations, and sequencing facilities were used. We validated two reduced gene sets that determine molecular subtypes with high accuracy and can be used as a panel-based method at lower cost, which is an important step toward introducing subtyping into routine practice.
Efforts are being made to methodologically simplify subtype classification by using reduced gene sets to enable its implementation in clinical practice. As soon as prospective and validated evidence is available, stratification will be important to select patients who respond to chemo- or immunotherapy and to reduce unnecessary toxicities and costs. In a previous study, a 47-gene panel (BASE47) was proposed for the discrimination between luminal and basal subtypes using the NanoString platform [14]. Recently, the ESSEN1 (n = 68) and ESSEN2 (n = 48) gene panels were developed to discriminate between 5–6 gene expression-based molecular subtypes according to different classification systems (e.g. TCGA, Consensus, Lund) by using qRT-PCR on fresh frozen samples and the NanoString technique on FFPE samples, respectively [16, 20]. These reduced gene sets still allowed the classifier to assign consensus molecular subtypes in all but one case, which had a lymphoepithelial histological subtype. Our results show a high accuracy of 85%–100% for both reduced panels in calling consensus molecular subtypes, compared to the comprehensive transcriptomic data. Discordant subtypes were observed between the stroma-rich and the Ba/Sq or the LumNS subtype and between the different luminal subtypes. Divergent subtype calls can result either from differences in gene expression or from the composition of the genes used for calculation. According to our data, the reduced ESSEN2 panel showed an even higher overlap with the comprehensive transcriptome data than the ESSEN1 panel. Thus, the selection of genes might be more important for the classification than their number alone.
As long as no clinical use case for molecular subtypes exists, a comprehensive gene expression analysis provides information on further genes and enables the analysis of additional immune or stroma signatures, which could find application as predictive markers [30–32].
Limitations include the small number of samples. With only one sample for some of the subtypes, general conclusions on which method is superior cannot be drawn. Furthermore, we did not perform a direct comparison between FFPE and fresh tissue. Our study also lacks a comparison of HTP and MACE with further sequencing techniques used in previous molecular subtyping studies, such as Illumina RNA-seq or Affymetrix. Most importantly, sequencing was not performed with the exact same RNA, but with RNA from neighboring tumor areas. Thus, discordant results can be the result of slight differences in tumor and stroma cell content or of differences in deeper tissue layers not represented on the slide. However, this issue might reflect the heterogeneity of bladder cancer itself. This becomes particularly evident with the stroma-rich molecular subtype, since it was present in all divergently classified samples, and it highlights problems affecting bulk RNA sequencing of bladder cancer samples in general. Molecular subtyping is performed on RNA derived from tissue (cores) of biopsy specimens representing only a fraction of the tumor. The subtype determined from the isolated RNA might not be the only and/or predominant subtype of a tumor, which is known to be heterogeneous, especially in bladder cancer [17, 33]. Furthermore, the subtype assigned depends on the cellular components from which the RNA is extracted, and the proportion of stromal content will influence whether a tumor is called stroma-rich or infiltrated [17, 31, 34, 35].
Conclusion
The consensus molecular subtypes represent a robust classification and can be determined based on comprehensive or selected FFPE transcriptome data. Using the data of matching pairs, the agreement of subtypes was high, but differences were observed when the stroma-rich molecular subtype was involved. Based on our results, it seems important to further unravel the heterogeneity of bladder cancer and the influence of stromal components on molecular subtypes to reduce sampling bias and allow more accurate assignment.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.