Main

Use of biomarkers to prospectively define patient populations that respond to particular treatments remains the exception in oncology; only a few recent novel agents, such as gefitinib or crizotinib, are given to patients selected by specific genetic aberrations in their tumours. Tumour samples are often unavailable for the hypothesis-generating analyses that subsequently lead to identification of meaningful patient selection markers. Serum samples, which can easily be collected, were used in this study to identify potential prognostic/predictive markers in a large controlled data set. The broad analysis of >200 serum markers highlights that certain biomarkers may inform on likely outcome from chemotherapy.

5-Fluorouracil (5-FU)-based chemotherapy (FOLFOX or XELOX), with or without the vascular endothelial growth factor (VEGF)-signalling inhibitor bevacizumab, is currently the standard of care for patients with first-line metastatic colorectal cancer (mCRC) in many countries (Goldberg, 2004). While many patients benefit from therapy, some do not appear to show any improvement, and all ultimately progress. Despite multiple attempts, mainly focussed on VEGF and soluble VEGF receptor-2 (sVEGFR-2), molecular markers to define which patients respond to treatment have not been identified; therefore, patients are not presently selected for therapy. This also applies to treatment with FOLFOX and XELOX alone, which was the standard treatment approach before bevacizumab was registered (Goldberg, 2004). However, some hypotheses for markers to potentially identify patients responding to VEGF-signalling inhibitors have been generated in smaller studies in several tumour types (Tran et al, 2012; Hegde et al, 2013; Liu et al, 2013). The absence of identified markers for patient selection is probably because of the lack of large, high-quality sample sets from controlled clinical trials, which could support a broad analysis of multiple factors.

The recently published HORIZON II trial was a large randomised, double-blind phase III study comparing cediranib with placebo, in combination with chemotherapy, in patients with previously untreated mCRC (ClinicalTrials.gov identifier NCT00399035) (Hoff et al, 2012). Patients were initially randomised 1 : 1 : 1 to receive cediranib 20 mg day−1, cediranib 30 mg day−1 or placebo, each in combination with FOLFOX/XELOX (chemotherapy was selected by the investigator). An independent data monitoring committee conducted an end-of-phase II analysis of data from the HORIZON programme, which showed that cediranib 20 mg day−1 met all of the predefined criteria for continuation of the HORIZON II study; all subsequent patients were randomised 2 : 1 to the cediranib 20 mg or placebo treatment arms. Patients receiving the discontinued dose of cediranib (30 mg) were unblinded and given the option to continue with open-label cediranib (20 or 30 mg). Collection of serum samples in the HORIZON II study enabled an analysis of multiple protein markers, in patients randomised to the cediranib 20 mg and placebo arms, to explore whether potential prognostic or predictive factors could be identified in a relatively large sample set. As patients in the control arm were treated with FOLFOX/XELOX only, identification of potential markers associated with either disease progression or chemotherapy treatment was possible.

Cediranib is an oral VEGF receptor (VEGFR) tyrosine kinase inhibitor with activity against all three VEGFRs and c-Kit (Wedge et al, 2005; Drevs et al, 2007). During early-phase clinical evaluation, the combination of cediranib with mFOLFOX6 demonstrated encouraging antitumour efficacy in patients with mCRC (Chen et al, 2009). The HORIZON II trial is one of the two pivotal phase III studies of cediranib in first-line mCRC; the other trial, HORIZON III, compared the efficacy of cediranib plus FOLFOX6 with bevacizumab plus FOLFOX6 (Schmoll et al, 2012). In the HORIZON II trial, cediranib plus FOLFOX/XELOX showed a significant improvement in the co-primary end point of progression-free survival (PFS) compared with chemotherapy alone (hazard ratio (HR)=0.84, 95% confidence interval (CI) 0.73–0.98; P=0.012), but there was no improvement in the other co-primary end point of overall survival (OS) vs chemotherapy alone (HR=0.94, 95% CI 0.79–1.12; P=0.571) (Hoff et al, 2012). Pre-specified analyses did not identify any subgroups of patients with a better or worse outcome following cediranib treatment compared with the overall patient population (Hoff et al, 2012).

To identify potential prognostic and predictive biomarkers for response to FOLFOX/XELOX and the VEGF-signalling inhibitor cediranib in mCRC, a retrospective analysis of baseline levels of 207 serum proteins was performed using the large sample set collected from patients recruited to the cediranib 20 mg and placebo arms of HORIZON II. As patients in the control arm were treated with FOLFOX/XELOX only, it was not possible to distinguish between potential markers associated with either disease progression or chemotherapy treatment. The comparison between the treatment arms facilitated potential identification of markers that characterise patients who could derive a greater benefit from the addition of a VEGF-signalling inhibitor to chemotherapy.

Patients and methods

Patients and sample analyses

Serum samples, collected in EDTA tubes, were obtained from patients randomised in the HORIZON II trial (n=1076) (Hoff et al, 2012). Samples were selected from patients who had provided a baseline sample and at least two follow-up samples (weeks 6/7 and 12). Analyses of the on-treatment samples compared with baseline samples will be reported separately. Of the 858 patients who received randomised treatment in the cediranib 20 mg (n=500) or placebo (n=358) arms, 812 remained on the study for at least 12 weeks and were therefore eligible to provide samples. Samples were available from 582 (67.8%) patients who received FOLFOX or XELOX with cediranib 20 mg (n=330) or placebo (n=252) in HORIZON II. Blood samples from those patients randomised to the cediranib 30 mg arm in the HORIZON II study were not included, as this treatment arm was unblinded following an end-of-phase II analysis, which could potentially confound the OS results. The subset of patients used for the analysis of the biomarkers maintained the demographic characteristics of the main trial, as discussed in Results. Serum samples were collected from each site and stored at –70 °C at a central storage facility. Serum levels of 207 proteins were measured centrally at Rules Based Medicine (RBM, Austin, TX, USA) using proprietary multiplexed immunoassays. The analytes, including their median baseline levels, minimum/maximum values and standard deviation, are shown in Supplementary Table 1.

Statistical methods

The primary covariates used in the trial were World Health Organisation (WHO) performance status, chemotherapy received, baseline liver function and study phase; these were balanced between treatment arms and are described elsewhere (Hoff et al, 2012). Due to the reduced sample size, we chose to omit the primary covariates from the biomarker analysis models, other than the direct comparison made to data obtained in the primary analysis (shown in Figure 1). A comparison of the full HORIZON II analysis with and without the covariates showed very comparable HRs and CIs, and these were consistent with the same analyses in the reduced biomarker analysis set.

Figure 1

Concordance of VEGF, sVEGFR-2 and CEA analyses using different methodologies. As a part of the primary HORIZON II analysis, VEGF, sVEGFR-2 and CEA levels were measured using plasma samples obtained from all patients. For the current analyses, biomarker levels were obtained using serum samples from a subset of patients participating in HORIZON II. BDS, biomarker data set; FDS, full data set.

Because of the large number of variables, the error rates associated with the standard techniques for analysing prognostic and predictive merit meant that these techniques alone were insufficient. Generalised boosted regression models (GBMs) were therefore chosen as the primary analysis method (Friedman, 2001; Ridgeway, 2007). Following the application of the GBM approach to both PFS and OS, the rankings of markers were compared and markers of interest were identified as those with high rankings in both lists. In addition, a range of markers related to the mode of action of cediranib were analysed further, independently of their identification in the initial statistical analysis.

Once markers were identified, a traditional statistical approach was used to assess the prognostic value (across both study arms) and predictive value (comparing sensitivity of patients on the cediranib arm with placebo) of each biomarker: the biomarker values were dichotomised into high and low categories based on the median baseline level for each marker. For determination of the prognostic value, a Cox proportional hazards model with a covariate for biomarker levels (high or low) was fitted for each response variable (PFS and OS); treatment (cediranib 20 mg or placebo) was included in the model (to eliminate its effect in case of chance correlation with the marker). From this, the HR and 95% CI were estimated. For the determination of the predictive value, a Cox proportional hazards model was fitted to each biomarker group (high/low) with a term for treatment (cediranib 20 mg vs placebo). The HR and CI associated with the treatment were then compared with the overall treatment effect observed in those patients who were eligible to provide a sample (n=812). For treatment and biomarker comparisons, unless stated otherwise, HRs <1 favour cediranib 20 mg or a high baseline level, respectively. All analyses presented here were performed using R 2.13 (http://www.r-project.org/), with the GBM analysis performed using the gbm_2.0-2 package (http://cran.r-project.org/web/packages/gbm/index.html).
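The median split and Cox step described above can be sketched in a few lines. The Python below is an illustration only and is not the authors' code (the analyses were run in R). The function names `dichotomise` and `cox_binary` are hypothetical, values strictly above the median are treated as high (how values equal to the median were handled is an assumption), ties are handled with the Breslow approximation, and the treatment term that the prognostic model included is omitted to keep the example to one covariate. The example data are invented.

```python
import math
from statistics import median


def dichotomise(values):
    """Label each value 1 (high) if strictly above the median, else 0 (low)."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def cox_binary(times, events, x, max_iter=25, tol=1e-10):
    """Fit a one-covariate Cox model by Newton-Raphson (Breslow ties).

    x is a 0/1 covariate. Returns (HR, lower 95% limit, upper 95% limit)
    for x=1 versus x=0.
    """
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0
        info = 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored patients contribute only to risk sets
            s0 = 0.0
            s1 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # patient j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += x_j * w
            mean = s1 / s0  # weighted mean of x in the risk set
            score += x_i - mean
            info += mean * (1.0 - mean)  # variance of a 0/1 covariate
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info)
    z = 1.96  # normal quantile for a two-sided 95% interval
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Invented example: eight patients, all with an observed event.
marker = [1.2, 3.4, 0.8, 5.1, 2.2, 4.0, 0.5, 6.3]
months = [10, 4, 12, 3, 5, 11, 15, 6]
event = [1, 1, 1, 1, 1, 1, 1, 1]

high = dichotomise(marker)
hr, lower, upper = cox_binary(months, event, high)
print(f"HR (high vs low) = {hr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

With this coding, an HR below 1 favours a high baseline level, matching the convention stated above. An established survival library would be preferable to this hand-written fit for any real analysis.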

Results

Relevance of patient set

As samples were not available for all patients, the available data set (biomarker data set; BDS) was compared with the full data set (FDS) from the HORIZON II study to ensure that demography and treatment outcome were comparable between data sets. Age, sex and race (Black, Caucasian, Oriental or other) were compared along with the stratification covariates from the HORIZON II trial, namely, WHO performance status (0 vs 1 or 2), chemotherapy type (FOLFOX4, FOLFOX6 or XELOX), study phase (i.e., whether or not patients contributed to the end-of-phase II analysis from the HORIZON programme), and liver function (alkaline phosphatase ≤320 U l−1 and albumin ≥35 g l−1 vs other). The FDS and the BDS showed comparable demographics. The efficacy analyses for the reduced data sets were comparable to the primary trial results, indicating that there was no bias with the BDS and that, where comparisons were made with the HR and CIs, they were reflective of the overall effect (data not shown).

Concordance of VEGF, sVEGFR-2, and carcinoembryonic antigen analyses using different methodologies

Of the 207 biomarkers analysed, three proteins (VEGF, sVEGFR-2 and carcinoembryonic antigen (CEA)) were also measured at baseline as a part of the primary HORIZON II study (values were split by median and correlated with outcome using the same methodology described above, but with the main trial covariates included) and have been reported (Jürgensmeier et al, 2013). VEGF and sVEGFR-2 were measured previously in plasma in a central laboratory (Labcorp, Research Triangle Park, NC, USA) using commercially available Quantikine ELISAs; CEA values were collected as per investigator’s analysis at site (Jürgensmeier et al, 2013). Figure 1 shows the statistical analyses based on the data obtained from plasma for the primary data set and serum for the BDS, along with CEA obtained locally for the primary data set and centrally for the BDS.

VEGF values ranged from 32 to 2760 pg ml−1 with a median of 124 pg ml−1 in the primary analysis, whereas values were 124–9330 pg ml−1 in the BDS with a median of 1420 pg ml−1. sVEGFR-2 values ranged from 5 to 13 ng ml−1 with a median of 11.6 ng ml−1 in the primary analysis, whereas values were 0.39–11 ng ml−1 in the BDS with a median of 5.7 ng ml−1. CEA values ranged from 0 to 21 276 ng ml−1 with a median of 40 ng ml−1 in the primary analysis, whereas values were 0.17–473 ng ml−1 in the BDS with a median of 33 ng ml−1.

Prognostic factors independent of treatment arm

The value of all 207 serum markers, split by the respective median, was correlated with outcome. Supplementary Tables 2 and 3 show the prognostic value of each biomarker by PFS and OS outcome, respectively. The markers identified in both analyses for which the 95% CI did not overlap with 1 are shown in Figure 2 as markers of potential interest.

Figure 2

Prognostic biomarkers independent of treatment. Association between selected serum markers and treatment outcome (PFS and OS); all biomarkers for which the 95% CI does not overlap with 1 are presented. Biomarkers of potential interest for both PFS and OS are shown in bold.

Because of the large number of analytes, standard techniques for analysing the prognostic value of these markers were considered insufficient. A novel method, GBM, using continuous values for each marker was applied (Friedman, 2001; Ridgeway, 2007). This approach orders the variables, in this case the biomarkers, by their relative importance. Markers are therefore ordered by the strength of their association with outcome compared with the association of the other markers. One model was fitted for PFS and one for OS, and the top 30 markers from each model were taken to identify the markers common to both end points. A Cox proportional hazards model with the dichotomised biomarker variable (high vs low) was then fitted for each of these important markers. From these, the HR and associated 95% CI were estimated. Figure 3 shows the 15 markers that are common to the top 30 markers from each GBM (PFS and OS), ordered by their relative importance.
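The selection of markers common to both rankings can be illustrated as follows. This Python sketch assumes that each fitted GBM has already produced a relative-importance score per marker. The function names `top_k` and `common_markers` are hypothetical, the importance values are invented, and ordering the shared markers by summed importance is an assumption, since the text does not state how the two rankings were combined for display.

```python
def top_k(importance, k):
    """Return the k marker names with the largest relative importance."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


def common_markers(imp_pfs, imp_os, k=30):
    """Markers in the top k of both models, ordered by summed importance."""
    shared = set(top_k(imp_pfs, k)) & set(top_k(imp_os, k))
    # Assumption: combine the two rankings by adding the importance scores.
    return sorted(shared, key=lambda m: imp_pfs[m] + imp_os[m], reverse=True)


# Invented importance scores for four markers (not study results).
imp_pfs = {"IL-6": 9.0, "CEA": 7.5, "MMP9": 3.0, "VEGF": 1.0}
imp_os = {"CEA": 8.0, "IL-6": 6.0, "VEGF": 4.0, "MMP9": 0.5}

shared = common_markers(imp_pfs, imp_os, k=3)
print(shared)
```

In the study the cut-off was the top 30 of 207 markers; the example uses k=3 only because the toy input has four markers.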

Figure 3

Potential prognostic markers identified from GBMs. Potential prognostic value of the markers that were determined to have the strongest association with treatment outcome based on the GBM models (for both PFS and OS).

A number of biomarkers appear to be prognostic for improved PFS and OS outcomes across both treatment arms: C-reactive protein, cancer antigen 72-4, CEA, cellular fibronectin, insulin-like growth factor binding protein 1, ICAM-1, IL-6, IL-8, IL-18, mannose receptor C type 2, matrix metallopeptidase (MMP) 7, tenascin C and TIMP metallopeptidase inhibitor 1. Low baseline levels (as defined by the median) of these proteins appear to be associated with favourable PFS, and particularly OS, compared with high levels. By contrast, high levels of MMP9 and SOD in the GBM were correlated with favourable outcome. Kaplan–Meier curves for these markers are shown in Figure 4.

Figure 4

Selected potential prognostic markers. Kaplan–Meier plots for selected biomarkers that appear to be prognostic for PFS and OS treatment outcome across both treatment arms. The median baseline level of each biomarker was used to determine the low and high subgroups.

In addition to the purely statistical approach to identify markers, specific markers within the analysed set, considered to be relevant to the mode of cediranib action, were explored further with regard to their prognostic relevance. These markers included proteins related to the targets of cediranib (VEGFs, VEGFRs, SCF and c-Kit), as well as additional proteins involved in angiogenesis. Figure 5 shows forest plots for the selected markers; Kaplan–Meier curves of these markers are shown in Supplementary Figure 1.

Figure 5

Angiogenesis markers: prognostic across treatment arms. Association with treatment outcome (PFS and OS) for specific markers that are considered to be most relevant to the mode of cediranib action.

Interestingly, low levels of VEGF, VEGF-D, VEGFR-1 (Flt-1), VEGFR-3 (Flt-4), neuropilin and Tie-2 showed better PFS and OS outcomes, independent of treatment. High levels of VEGFR-2 (KDR) and c-Kit were seen in a group of patients with better OS outcomes, independent of treatment; however, this was not observed for PFS. By contrast, there was no correlation between outcome and a range of other proteins involved in angiogenesis: VEGF-B, VEGF-C, thrombospondin 1, PDGF-BB, PlGF, SCF, EGF, EGFR-1, Her-2, HGF, Ang-2, bFGF, endothelin 1, erythropoietin and endoglin.

Predictive factors for cediranib outcome

To evaluate whether any of the 207 proteins analysed could be used as predictive factors to identify which patients respond to cediranib treatment compared with the control arm, the data were analysed for each arm, splitting the values for each factor at the median. The full analysis is illustrated in Figure 6.

Figure 6

Predictive value of markers at baseline. Association between serum markers and response to treatment with cediranib compared with chemotherapy alone. Data were analysed for both treatment arms. The median baseline level of each biomarker was used to determine the low and high subgroups.

For most markers analysed, the HRs for the low and high values were close to the overall HRs for PFS and OS; there were very few markers for which the CIs did not include the overall trial HR, and none of these few markers demonstrated consistent results between the PFS and OS analyses. The CIs for these markers all overlapped with the CI for the overall HR and, because there was no consistency between PFS and OS end points, there was no compelling evidence from this analysis to suggest that any of the biomarkers were predictive of a difference in cediranib treatment effect. GBMs were also used as an analysis method and did not identify any markers that were predictive of a better response to cediranib (data not shown).
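The screen described above, in which a subgroup is of interest only if its CI excludes the overall HR for both end points, can be written as a short check. This Python sketch is an illustration under stated assumptions: the names `excludes` and `consistently_predictive` are hypothetical, the subgroup intervals are invented, and the overall HRs (0.84 for PFS, 0.94 for OS) are the primary trial values quoted earlier rather than the values for the 812 eligible patients, which are not given in the text.

```python
def excludes(ci, value):
    """True if the interval (lower, upper) does not contain value."""
    lower, upper = ci
    return value < lower or value > upper


def consistently_predictive(subgroup_cis, overall_hr):
    """Return (marker, subgroup) pairs whose treatment-effect CI excludes
    the overall HR for every end point listed in overall_hr."""
    flagged = []
    for marker, by_endpoint in subgroup_cis.items():
        for group in ("low", "high"):
            if all(excludes(by_endpoint[ep][group], overall_hr[ep])
                   for ep in overall_hr):
                flagged.append((marker, group))
    return flagged


# Overall treatment HRs quoted from the primary trial report.
overall_hr = {"PFS": 0.84, "OS": 0.94}

# Invented subgroup CIs for two hypothetical markers (not study results).
subgroup_cis = {
    "marker_A": {  # differs from the overall HR for PFS only
        "PFS": {"low": (0.50, 0.80), "high": (0.85, 1.30)},
        "OS": {"low": (0.70, 1.10), "high": (0.75, 1.20)},
    },
    "marker_B": {  # in line with the overall HR throughout
        "PFS": {"low": (0.70, 1.00), "high": (0.68, 1.02)},
        "OS": {"low": (0.75, 1.15), "high": (0.78, 1.18)},
    },
}

flagged = consistently_predictive(subgroup_cis, overall_hr)
print(flagged)
```

Neither hypothetical marker is flagged, which mirrors the pattern reported here: an isolated PFS signal without a matching OS signal is not treated as evidence of a predictive effect.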

Discussion

A lack of biomarkers that are able to inform which mCRC patients are more or less likely to benefit from FOLFOX/XELOX therapy alone, or in combination with a VEGF-signalling inhibitor, leaves patient selection challenging at present. Identification of biomarkers that predict which patients will respond, or gain a benefit, will be beneficial when prioritising treatment for the patients with this disease. To address this question, we used a comprehensive collection of samples from the HORIZON II trial (Hoff et al, 2012), which provided the largest sample set to date from a randomised study and enabled retrospective exploration of a broad range of serum biomarkers at baseline, and their relevance to treatment outcome and disease.

HORIZON II was a multicentre global phase III study with primary end points of PFS and OS and the biomarker analyses described here were exploratory and performed retrospectively. While laboratory manuals described the collection and storage of plasma and serum, the detailed adherence to the procedures was not monitored continuously at each site.

To ensure that conclusions could be drawn from this analysis of a subset of patients from the HORIZON II study, demography and outcome of the subset were compared with the full analysis set from the primary analyses. The data sets were shown to have comparable demographics; therefore, no sample bias is expected, which may have otherwise confounded any interpretation or generalisation. The HRs and associated CIs for the biomarker analysis set for PFS and OS were comparable to the primary trial results, indicating that the subset is representative of the full analysis set.

CEA, VEGF and sVEGFR-2 were all measured as a part of the primary analysis, in addition to being included in our multiplexed analysis. In the primary analysis, CEA was not measured centrally, but evaluated at the sites according to local practice (Jürgensmeier et al, 2013). The results from the new analysis of a subset of patients and a central analysis of CEA both concur with the original finding that CEA is strongly prognostic, but that CEA values do not predict which patients derive most benefit from the addition of cediranib to chemotherapy in mCRC. These data suggest that central measurement of CEA might not be required, and that values reported by clinical centres can be used for future studies.

In the primary analysis, VEGF and sVEGFR-2 were measured from plasma samples (Jürgensmeier et al, 2013), whereas serum was used for the multiplex analysis described here. While total values for these markers, as expected, differ between plasma and serum, when using the respective median for each analysis, the correlations with PFS and OS showed very similar results. Levels of VEGF were prognostic in this setting for PFS and OS, independent of treatment. Baseline levels of neither marker predicted which patients would derive most benefit from cediranib treatment, when added to chemotherapy. This is in line with a recently published analysis of VEGF levels across multiple studies involving the VEGF-signalling inhibitor bevacizumab, in combination with a range of other agents, which showed that VEGF levels were prognostic but not predictive for patients that would derive most benefit from treatment with bevacizumab (Hegde et al, 2013).

The high concordance of the CEA, VEGF and sVEGFR-2 data between the full and subset analyses not only supports the use of different methodologies to assess these markers, but also increases our confidence in the analysis of the 207 markers using a novel multiplexed technology (Jürgensmeier et al, 2013).

A large number of serum proteins were explored for their prognostic value in this first-line mCRC study. All patients in the study received treatment with FOLFOX or XELOX, and the initial evaluation across treatment arms (independent of addition of cediranib to the treatment) identified multiple prognostic factors. Due to the large data set, diverse statistical methodology was applied and data from PFS and OS analyses were compared to enable a robust interpretation. In the ‘hypothesis-free’, purely statistical analysis (GBM method), C-reactive protein, cancer antigen 72-4, CEA, cellular fibronectin, insulin-like growth factor binding protein 1, ICAM-1, IL-6, IL-8, IL-18, mannose receptor C type 2, MMP7, tenascin C and TIMP metallopeptidase inhibitor 1 were identified as prognostic factors. A focussed analysis of the signalling pathways related to cediranib identified several additional prognostic markers, namely low levels of VEGF, VEGF-D, VEGFR-1 (Flt-1), VEGFR-3 (Flt-4), neuropilin and Tie-2. High levels of VEGFR-2 (KDR) and c-Kit (mast/stem cell factor receptor) were associated with improved OS outcomes, independent of treatment; however, an improvement was not observed for PFS. All of these markers that were identified to be prognostic showed high baseline expression in the peripheral blood relative to the normal range. It cannot be excluded that additional markers could be identified from the proteins that display a smaller concentration range. Given the large data set, the analyses described here focus on the median cutoff and continuous scale for the GBM method to identify the most relevant markers overall.

There was general consistency between the analysis methods; however, due to the covariate selection processes involved in the GBM routine, some markers may not have been identified by the GBM approach. This is most likely to have occurred when a marker is strongly correlated with another marker that is more closely associated with response. In this situation, the marker less associated with response would be down-weighted in favour of the other marker. This could be the case for VEGF, which was shown to be prognostic in the initial analysis, but was not highlighted by GBM. If a strong correlation is observed between two markers, then the identification of other overlapping factors by GBM cannot be ruled out. While this analysis identifies the most significant independent markers, some other markers with a strong correlation to treatment outcome may be missed; therefore, investigation of the between-marker relationships is warranted and has the potential to be highly informative. When specifically evaluating markers related to cediranib signalling, VEGFR-2 correlated with OS, which was not observed in the primary analysis adjusted for the primary covariates (Figure 1).

These data show that multiple factors influence the benefit that patients may derive from FOLFOX/XELOX chemotherapy and could, if confirmed in independent studies, guide future study design and treatment options for patients. Of particular interest is that many factors involved in VEGF signalling and angiogenesis influence outcome on chemotherapy, which was unexpected. It may be that these prognostic factors dominate the effect, which could mask any further impact of adding a VEGF-signalling inhibitor to the treatment.

A patient selection strategy to effectively identify patients who are most likely to benefit from treatment with VEGF-signalling inhibitors, such as cediranib, has yet to be identified. For the first time, the large sample set from the HORIZON II study allowed a broad exploration of multiple protein markers in serum and their relevance for treatment benefit derived from the VEGF-signalling inhibitor cediranib. However, with the described methodology, predictive markers for benefit on cediranib treatment were not identified with consistent data between PFS and OS. A number of factors might contribute to this disappointing result. In this trial, cediranib plus FOLFOX/XELOX showed a significant improvement in the co-primary end point of PFS compared with chemotherapy alone (HR=0.84, 95% CI 0.73–0.98; P=0.012), but there was no improvement in the other co-primary end point of OS vs chemotherapy alone (HR=0.94, 95% CI 0.79–1.12; P=0.571) (Hoff et al, 2012). Similar results have also been observed with another VEGF-signalling inhibitor, bevacizumab, which improved PFS when added to FOLFOX, but had no significant effect on OS (Saltz et al, 2008). Post-progression therapy was balanced across both arms in HORIZON II (Hoff et al, 2012), suggesting that this was not a major influence on the different outcomes for PFS and OS. However, additional treatment after progression may have affected a correlation of predictive markers between the PFS and OS end points. The markers that were identified to correlate with better PFS therefore warrant further evaluation in future prospective studies with VEGF-signalling inhibitors. In addition, the large number of prognostic markers identified across the trial arms may indicate that the chemotherapy treatment received by all patients was dominating the overall effect on serum markers and that the additional effect of cediranib was masked. This is particularly noteworthy, as markers in the VEGF signalling and angiogenesis signalling pathways have been identified in this study to be related to chemotherapy outcome. Suitable data to study the effect of chemotherapy in treatment-naïve patients are not available, and cannot be derived from any future clinical trial, since a placebo vs chemotherapy study design would not be acceptable because of present treatment standards.

A small single-arm study of FOLFIRI plus bevacizumab in patients with first-line mCRC reported that baseline levels of VEGF and sVEGFR-2 were not predictive for outcome (Kopetz et al, 2010). The study also found that high IL-8 levels, when dichotomised by median, were associated with shorter PFS, consistent with the findings in this study across treatment arms. However, because of the study design without a control arm, only limited conclusions can be drawn regarding the benefit of each treatment. Hegde et al (2013) recently evaluated VEGF levels across different studies (multiple tumour types, multiple comparators and combinations) that included bevacizumab in the treatment arm and found high levels of VEGF to be a negative prognostic factor in line with our findings. For CRC, multiple angiogenic factors have been evaluated in 38 patients treated with capecitabine, oxaliplatin and bevacizumab; however, the study lacked a control arm (Liu et al, 2013). High levels of vWF and Ang-2 correlated with shorter PFS, and baseline levels of VCAM-1, vWF, TSP-2, IL-8, MMP2 and Ang-2 all correlated with OS. There is overlap between the biomarkers identified in our study and factors identified to be potentially prognostic for treatment with pazopanib in renal cell cancer albeit again from a single-arm study (Tran et al, 2012). Taken together, a panel of biomarkers including VEGF, MMPs, ICAM, VCAM, IL-6, IL-8 and TIMP, which have emerged from multiple studies with VEGF-signalling inhibitors, merit further investigation in future prospective studies.

In summary, the findings from this study identify a number of potentially prognostic biomarkers for patients with mCRC treated with FOLFOX or XELOX. While samples from treatment-naïve patients are not available and all patients included in this study received chemotherapy, the identified factors may be predictive markers for chemotherapy (FOLFOX/XELOX) outcome or disease progression. By contrast, there was no evidence to suggest that any subgroups of patients would derive greater benefit from cediranib treatment based on these analyses.