Background
There is a great unmet need for better treatment and diagnosis of kidney cancer, which remains the most lethal of all genitourinary malignancies. Five-year survival in renal cell cancer (RCC) is approximately 40% overall and 10% in metastatic disease [1, 2]. Clear cell RCC (ccRCC) represents about 80% of cases, and around one-third of patients present with metastasis. Current risk stratification of advanced ccRCC uses clinico-pathological scoring systems, for example, the International Metastatic RCC Database Consortium (IMDC) [3] and Memorial Sloan Kettering Cancer Center (MSKCC) [4] scores. Molecular markers promise to overcome the performance plateau encountered by clinico-pathological variables; however, success rates have historically been low [5–8].
Sunitinib is a first-line treatment for metastatic ccRCC (mccRCC), doubling median progression-free survival compared with older immunotherapies such as IL-2 and interferon-α [9, 10]. Sunitinib targets tumour cells, endothelial cells and pericytes, where the mechanism of action includes competitive inhibition of multiple receptor tyrosine kinases (RTKs) [11, 12]. Up to 70% of patients treated with sunitinib show little or no tumour response [10], although they may derive a survival benefit despite incurring significant toxicity. Improved algorithms are critically needed to guide treatment decisions for current and emerging modalities [6, 7, 13].
Advances in prediction of treatment response and prognostication may be severely hindered by intratumoural heterogeneity (ITH) [14–16]. Indeed, percutaneous biopsy of mccRCC is a poor guide for pathological assessment of prognostic features [17]. Development of tumour sampling approaches to capture ITH is key for discovery and validation of candidate molecular risk stratification algorithms [6, 7, 13, 15]. We studied protein expression ITH in the context of mccRCC risk stratification, controlling for clinical variables, and developed a novel prognostic model (NEAT, for N-cadherin, EPCAM, Age, mTOR) that compares well with established clinico-pathological scores. The variables selected in NEAT inform mccRCC biology and suggest sunitinib action directly on tumour growth signalling. We quantitatively show a dramatic effect of tumour sampling on NEAT performance in a validation cohort receiving current standard treatment and demonstrate parameters pertinent to the development of molecular diagnostic tools for cancer medicine. We present recommendations that guide tumour sample selection for biomarker research in order to overcome variability in the presence of ITH. Indeed, sampling protocols may determine the success or failure of attempts to validate molecular biomarkers where ITH is a factor.
Discussion
This study examines the effect of sampling on the performance of a novel molecular prognostic approach, NEAT, using protein measurements from 183 regions across 44 mccRCC tumours. The unique development cohort from the SuMR trial allowed for selection of proteins that had increased intratumoural expression variance with treatment; we hypothesised that these proteins may be markers of aggressiveness and therefore useful in prognostication. Although the cohorts are relatively small, NEAT gave statistically robust stratification of the independent validation cohort by overall survival (Fig. 3a). The trend for favourable NEAT performance relative to the IMDC and MSKCC scores would benefit from investigation in a larger cohort, and the good performance of IMDC relative to the MSKCC score aligns with previous work [3]. To our knowledge, the mccRCC cohorts analysed here are the largest available with RPPA data from pathologist-guided, multiregion tumour sampling. Our approach to capture grade diversity is likely to better represent ITH than standard sampling methods. Furthermore, each sample analysed by RPPA reflects a large tissue volume (circa 50–75 mm³) relative to standard approaches based on tissue sections from formalin-fixed paraffin embedded material such as tissue microarray analysis (<0.2 mm³ per region). Therefore, the RPPA data analysed cover a higher proportion of the overall tumour volume relative to standard approaches. The sampling approaches may be an important enabling factor in NEAT reproducibility and hence good validation performance, despite the relatively small cohorts studied. The RPPA technique offers potential as a quantitative alternative to IHC and has already been applied in a clinical setting through the Clinical Laboratory Improvement Amendments (CLIA) facility certification process [36, 37]. The NEAT model might ultimately be applied to inform decision making and patient management in several areas: (1) monitoring and follow-up, (2) recruitment into clinical trials with new agents, (3) treatment decisions, for example, for patients on the borderline of receiving drug due to other factors and (4) patient counselling.
The NEAT development and validation cohorts were relatively small (n = 44 total), which is associated with increased risk of type II error and wide confidence intervals on predictive performance. Cytoreductive nephrectomy is standard clinical practice, and the use of upfront tyrosine kinase inhibitor (TKI) treatment is variable, limiting recruitment of a uniform cohort (as was obtained from the SuMR clinical trial) for NEAT development. A further limiting factor on the size of cohorts in our study was the availability of appropriately consented fresh-frozen material with multiregion sampling and pathology assessment for RPPA analysis. Our approach to discover resistance biomarkers required multiregion sampling of tumour tissue from patients treated with upfront sunitinib in order to enable comparison of candidate marker variance in sunitinib-exposed and sunitinib-naïve material. Therefore, the cohorts received different treatment regimens and also had significant differences in some clinical characteristics. NEAT performed well on both cohorts despite these differences, and so might be broadly useful for prognostication of mccRCC. Further study of NEAT performance on an independent upfront sunitinib cohort would be of interest to further explore potential clinical utility, such as to inform decision making about performing a cytoreductive nephrectomy [38].
Subsampling of the multiregion RPPA data showed that validation of the NEAT prognostic model was critically dependent on the number of samples analysed per tumour. Indeed, the model's performance in risk stratification improved significantly at each increase in the number of tumour regions analysed (Fig. 5a). These results therefore evidence the benefit of more extensive tumour sampling both for biomarker development and also in validation studies where the sampling protocol may contribute to a reported lack of reproducibility. The efficacy of even the most promising tissue-based biomarkers is diminished by ITH [39], and identification of molecular predictors that are unaffected by ITH may be very challenging. Indeed, cancer biomarkers have historically suffered from a high attrition rate [8]. The available data permitted subsampling analysis of one, two and three samples per tumour; however, analysis with the full dataset (median of four samples) performed best. In principle, even higher sampling rates may be beneficial; several patients from whom >3 samples were taken, reflecting larger tumours, show considerable variation in HR even when large numbers of samples are analysed (Fig. 5b). One patient in whom eight tumour regions were examined had substantial variation in NEAT HR even across subsets containing six samples. Therefore, the influence of tumour sampling on predicted risk is clear for individual patients. These results also evidence the benefit of sampling in proportion to tumour volume for molecular diagnostics. We found considerably greater variance in HR for low grade over high grade samples; thus, tumour biomarker studies would benefit from performing more extensive sampling of low grade regions. This result also underlines the additional information provided by NEAT. Indeed, the automatic feature selection process deprioritised grade relative to molecular variables. Prognostication using all of the multiple tumour samples gave better risk stratification than provided by analysis of any single sample in isolation. Therefore, NEAT analysis with multiple tumour regions captures information unavailable in any single sample; this information may reflect the adaptive potential arising from ITH [40] and also might include aspects of disease progression such as the degree of vascularisation or the length of time since initial dissemination competence.
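The subsampling idea discussed above can be illustrated with a minimal Python sketch. This is not the analysis code used in the study. The weights are hypothetical placeholders rather than fitted NEAT coefficients (only the signs for the three proteins follow the associations described in this Discussion, and the age weight is arbitrary), summarising each tumour by the per-marker median across regions is an assumption, and the marker values are invented. The sketch enumerates every subset of k regions from one tumour and reports how far the resulting relative hazard can vary.

```python
from itertools import combinations
from math import exp
from statistics import median

# Hypothetical log-hazard weights for illustration only; NOT the fitted NEAT model.
# Protein signs follow the text (N-cadherin and EPCAM adverse, mTOR favourable).
WEIGHTS = {"ncad": 0.5, "epcam": 0.3, "mtor": -0.8}
AGE_WEIGHT = 0.02  # arbitrary placeholder


def relative_hazard(regions, age):
    """Relative hazard from per-marker medians over the supplied regions (assumed summary)."""
    linear_predictor = AGE_WEIGHT * age
    for marker, weight in WEIGHTS.items():
        linear_predictor += weight * median(r[marker] for r in regions)
    return exp(linear_predictor)


def subsample_hazards(regions, age, k):
    """Relative hazard for every subset of k regions drawn from one tumour."""
    return [relative_hazard(subset, age) for subset in combinations(regions, k)]


def hazard_spread(regions, age, k):
    """Ratio of the largest to the smallest relative hazard across all k-region subsets."""
    hazards = subsample_hazards(regions, age, k)
    return max(hazards) / min(hazards)


# Invented marker values for four regions of a single tumour.
tumour = [
    {"ncad": 0.2, "epcam": 0.1, "mtor": 1.0},
    {"ncad": 0.9, "epcam": 0.4, "mtor": 0.3},
    {"ncad": 0.5, "epcam": 0.2, "mtor": 0.6},
    {"ncad": 1.1, "epcam": 0.6, "mtor": 0.1},
]

for k in range(1, len(tumour) + 1):
    n_subsets = len(subsample_hazards(tumour, 60, k))
    print(f"k={k}: {n_subsets} subsets, spread={hazard_spread(tumour, 60, k):.2f}")
```

A spread of 1 means every subset of that size yields the same predicted risk; larger values indicate that the prediction for this patient depends on which regions happened to be sampled. With all regions included there is only one subset, so the spread is 1 by construction.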
With regard to the individual components of the NEAT model, the positive association of mTOR with overall survival was the strongest, most significant feature and was also found in univariate analysis of an overlapping cohort. The mTOR pathway is an important mediator of RTK growth signalling [41]. Improved prognosis associated with elevated mTOR in NEAT suggests that tumours dependent upon mTOR have enhanced sensitivity to sunitinib. Therefore, sunitinib may act directly on tumour cells to inhibit mccRCC growth, consistent with results in ovarian cancer that VEGF stimulates the mTOR pathway [42]. Additionally, the mTORC1 complex, which includes mTOR, exerts negative feedback on RTKs to suppress proliferation and survival [41]; this negative feedback could enhance therapeutic RTK inhibition by sunitinib. Notably, mTOR inhibitors are currently in clinical use (for example, everolimus), possibly in conjunction with sunitinib or similar agents. Our results suggest caution in co-treating with mTOR inhibitors and sunitinib, resonating with the poor performance of everolimus followed by sunitinib in the RECORD-3 trial [43]. Consistent with previous results, for example [44, 45], a significant negative association with survival was identified for N-cadherin, a canonical marker of epithelial to mesenchymal transition. Additionally, N-cadherin is expressed by endothelial cells and so may also represent a surrogate for vascularisation [46]. Age is a known RCC prognostic factor that was not selected for the IMDC score [3, 47, 48]. Our analysis took age as continuous values, which may partly explain selection of this variable for the NEAT model and not in the IMDC analysis, which dichotomised age at 60 years [49]. The IMDC score was not selected by our machine learning approach, which implies that, in the development cohort, prognostic information captured by the IMDC score overlaps with that provided by the NEAT variables. High EPCAM expression is also associated with poor prognosis in NEAT and in multiple cancers [50, 51], although reports link EPCAM with better prognosis in localised RCC; see, for example, [52, 53]. The contrasting association with survival for EPCAM in NEAT may be due to differences between advanced and localised ccRCC, technologies used and context-specific function, for example, in signal transduction by nuclear localisation of the cleaved intracellular domain [54].
Conclusions
Multiregion sampling to capture mccRCC grade diversity enabled investigation of ITH impact on risk stratification with a novel protein-based prognostic model, NEAT (N-cadherin, EPCAM, Age, mTOR). NEAT compares well with established clinico-pathological scores on a geographically separate independent validation cohort who received current standard therapy. Results show that evaluation or attempted use of any molecular prognostic and predictive methods with few tumour samples will lead to variable performance and low reproducibility. We demonstrate parameters (tumour coverage, size, grade) that may be used to inform sampling in order to enhance biomarker reproducibility, and results underline the critical importance of addressing heterogeneity to realise the promise of molecular stratification approaches. Through studies such as TRACERx [55], we anticipate that extensive multiregion sampling will become standard procedure for discovery and validation of molecular diagnostics across a range of cancer types.
Recommendations arising from our research include the following: (1) biomarker validation studies should implement tumour sampling protocols that match as closely as possible to the discovery work; (2) clinical biomarker research and ultimately front-line diagnostic approaches may benefit from greater tumour sampling rates; (3) clinical parameters (including tumour grade, size, coverage) can guide sample selection, and investigation of additional parameters to inform sampling may be useful; (4) optimisation of tumour sampling rate and sample selection protocols are important research areas to enable advances in stratified cancer medicine.
Acknowledgements
Thanks to Professor Fei Ye, Department of Biostatistics, Vanderbilt University, Nashville, TN, USA for critical reading of the manuscript.