Introduction
Ovarian cancer continues to be the leading cause of death among women with gynaecological malignancies in developed nations (Parkin et al. 2005). In the USA, the prevalence of ovarian cancer in postmenopausal women is 1 in 2,500 and the lifetime risk of a woman developing ovarian cancer is 1 in 72 (1.39%). The age-adjusted incidence and death rates for ovarian cancer are 13.3 and 8.8 per 100,000, respectively. The overall 5-year survival rate is 45.5%. Five-year survival rates are inversely related to the stage of disease at first diagnosis. Early stage ovarian cancer is asymptomatic and only 19% of cases are first diagnosed as localised primary cancer (i.e. Stage I). The corresponding 5-year survival rate for Stage I disease is 92.7%. The majority of cases (67–74%), however, are diagnosed with metastatic disease (i.e. Stages III and IV), when the 5-year survival rate is only 30.6% (Ries et al. 2006).
The diagnosis of localised, primary cancer and the development of tests with better diagnostic efficiency are undoubtedly the major priorities for achieving long-term reduction of mortality due to ovarian cancer (Paley 2001). Indeed, available data support the hypothesis that ovarian cancers may be detectable up to 2–5 years prior to their clinical presentation (Jacobs et al. 1993, 1996, 1999; Zurawski et al. 1988) and that if effective screening for Stage I disease were achieved with an accuracy of 80% or more, mortality would be halved (Bast et al. 1983). The development of effective community-based screening or earlier detection tests for ovarian cancer, however, is challenging because of the low prevalence of the disease (Parkin et al. 2005).
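The effect of low prevalence on a screening test can be made concrete with Bayes' rule. The following is a minimal sketch, not part of the study's analysis: the prevalence of 1 in 2,500 is taken from the text above, whereas the sensitivity (80%) and specificity (99%) are illustrative assumptions, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population, via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalence (1 in 2,500) is from the text; sensitivity and specificity
# are assumed values chosen only for illustration.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.99, prevalence=1 / 2500)
print(f"PPV = {ppv:.3f}, NPV = {npv:.5f}")
```

Under these assumed values the positive predictive value is only about 3%, that is, roughly 31 false positives for every true positive, which illustrates why very high specificity is needed for population screening.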
The most well-characterised biomarker for ovarian cancer is CA-125 (Nossov et al. 2008). Serum concentrations of CA-125 are elevated (i.e. >35 U/ml) in more than 90% of patients with late stage disease but are elevated in only 50% of patients with Stage I disease (Nustad et al. 1996). It is becoming evident that single-biomarker approaches for the detection of early stage ovarian cancer may never realise the diagnostic efficiency required for implementation as a community-based screening test, and that alternative approaches, including the combination of multiple biomarkers, may be required (Badgwell and Bast 2007).
The quantification of multiple blood-borne biomarkers and the use of multivariate classification models represent a promising approach for improving diagnostic efficiency. Such biomarkers may represent: unique tumour-derived or over-expression products; products elaborated via aberrant neoplastic processing/modification of host proteins; and/or host response proteins elicited by the presence of the tumour and which may display profiles that vary and/or are specific for different types of tumours.
Previous studies have established proof-of-principle and reported significant improvements in the detection of ovarian cancer using such approaches. For example, Gorelik et al. (2005) measured seven serum biomarkers in 44 newly diagnosed women with early stage ovarian cancer (i.e. Stages I and II) and 45 healthy controls and employed five of the biomarkers (including CA-125) in a classification tree analysis to achieve a sensitivity of 84% at a specificity of 95% for the detection of early stage disease. More recently, Visintin et al. (2008), in a study of 156 newly diagnosed ovarian cancer patients (48 Stage I, 62 Stage II, 64 Stage III and 64 Stage IV) and 362 healthy women, utilised six serum biomarkers (including CA-125) in a multivariate classification model and reported a sensitivity and specificity of 95.3 and 99.4%, respectively, for the overall detection of ovarian cancer.
To further evaluate the utility of a multimarker approach for the diagnosis of ovarian cancer, in this study a phase II biomarker trial (Pepe et al. 2001) was conducted to evaluate the performance of a panel of plasma biomarkers (CA-125, CRP, IL-6, IL-8 and SAA) to correctly classify women with ovarian cancer. The retrospective, case–control, modelling/validation study was designed to test the primary hypothesis that the area under the receiver operator characteristic curve (AUC) for the biomarker panel was significantly greater than the AUC for CA-125 alone. The data were additionally stratified to assess the performance of the biomarker panel to correctly classify women with early stage ovarian cancer (i.e. Stages I and II).
Materials and methods
Sample collection
Blood (10 ml) was collected via venepuncture into EDTA vacutainer tubes. Samples were centrifuged at 1,000×g for 10 min within 20–30 min of collection. Plasma was divided into 250–500 μl aliquots and stored at −80°C until assayed. No plasma sample analysed had been stored for more than 6 years. In a preliminary study, analyte concentrations were found not to correlate significantly with duration of storage. Additional disease and control samples were provided by the Biobank at Peter MacCallum Cancer Research Institute. Samples were randomly selected from the sample bank. Inclusion and exclusion criteria for the trial are detailed in Table 1. The distribution of samples by disease stage, tumour type and patient age is summarised in Table 2.
Table 1
Criteria for inclusion and exclusion in the Phase II biomarker trial
Inclusion criteria | Exclusion criteria |
Age 18–90 | Chemotherapy, biologic therapy or any other investigational drug for any reason within 28 days prior to sampling |
Newly diagnosed, pathologically confirmed diagnosis of epithelial carcinoma of the ovary | Except for cancer-related abnormalities, patients should not have unstable or pre-existing major medical conditions |
No NSAID or anti-inflammatory steroids used within 14 days of sampling | Major surgical procedure, open biopsy, or significant traumatic injury within 28 days prior to sampling |
No previous chemotherapy or radiation therapy | Minor surgical procedures, fine needle aspirations or core biopsies within 7 days prior to sampling |
No concurrent disease(s) | Serious, non-healing wound or ulcer |
Signed informed client consent | |
Table 2
Age, stage and tumour type distribution within all cases (n = 150)
Age Mean ± SE Median (range) | 59 ± 1.8 56 (41–79) | 60 ± 1.7 61 (24–85) | 57 ± 2.0 57.5 (18–89) | 66 ± 4.0 66 (51–59) | 53 ± 4.3 53 (38–69) | 59 ± 1.0 58 (18–89) | |
Serous carcinoma | 10 | 41 | 38 | 7 | 3 | 99 | 66.0 |
Endometrioid carcinoma | 6 | 6 | 3 | | | 15 | 10.0 |
Mucinous carcinoma | 5 | 3 | 3 | | | 11 | 7.3 |
Clear cell carcinoma | 1 | 6 | 1 | | 3 | 11 | 7.3 |
Mixed carcinoma | 3 | 3 | | | | 6 | 4.0 |
Untyped carcinoma | 3 | 4 | 1 | | | 8 | 5.3 |
Total | 28 | 63 | 46 | 7 | 6 | 150 | |
% | 18.7 | 42.0 | 30.7 | 4.7 | 4.0 | | |
Study design
A phase II biomarker trial design (Pepe et al. 2001) was employed to assess the diagnostic efficiency of a biomarker panel for the detection of ovarian cancer. The study was a retrospective, case–control design in which a multivariate classification model was developed using a modelling sample cohort, n = 179 (Table 3). The multivariate classification model (diagnostic rule) was validated using an independent sample cohort (n = 183). The primary outcome of the study was to test the hypothesis that the area under the receiver operator characteristic (ROC) curve for the biomarker panel (AUC_A) was significantly greater than for CA-125 alone (AUC_C). It is acknowledged that most informative biomarkers will increase the AUC by 0.05 or more, and that good risk prediction models will have an AUC greater than 0.7 (May and Wang 2008). Secondary outcomes of the study were: (1) to estimate the sensitivity and specificity of the multimarker panel; and (2) to determine the relationship between the predicted posterior probability for membership of the disease class (ρP, derived from multivariate modelling) and disease stage and type. The performance of the diagnostic rule was also evaluated using a subset of the validation cohort: all controls + Stages I and II cases only, designated the early stage cohort.
Table 3
Sample distribution between the modelling and validation cohorts
Cohort | Total (n) | Controls | Cases | Early stage cases | Late stage cases |
a. All stages |
Modelling | 179 | 97 | 82 | 52 | 30 |
Validation | 183 | 115 | 68 | 39 | 29 |
b. Early stages |
Modelling | 179 | 97 | 82 | 52 | 30 |
Early stage | 154 | 115 | 39 | 39 | 0 |
Sample processing
Where necessary, plasma samples were diluted appropriately for each assay according to the manufacturers' specifications using a phosphate buffer containing bovine serum albumin (Sigma, St. Louis, MO, USA). In brief, for IL-6 and IL-8 assays, plasma samples were diluted 1:4, and for SAP, SAA and CRP assays plasma samples were diluted 1:2,000. Plasma CA-125 concentrations were assayed without prior dilution.
Frozen plasma samples and dilutions were thawed on ice prior to assay. All assays were performed in accordance with manufacturers’ instructions. All assays contained supplied standard curve samples of known analyte concentration. All standards, controls and patient samples were assayed in duplicate. Upon completion of each multiplex assay, a 5-parameter fit equation was employed to generate standard curves, from which mean values for each sample were calculated.
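The back-calculation of a sample concentration from a fitted standard curve can be sketched as follows. This is an illustration only: it assumes the commonly used five-parameter logistic (5PL) form, the parameter values shown are hypothetical rather than taken from any assay in this study, and averaging the back-calculated concentrations of the duplicates (rather than their raw signals) is also an assumption. Fitting the curve to the standards is left to the assay software and is not shown.

```python
def five_pl(x, a, b, c, d, g):
    """Five-parameter logistic response at concentration x.

    a: response at zero concentration, d: response at infinite concentration,
    c: mid-range concentration parameter, b: slope, g: asymmetry.
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** g


def inverse_five_pl(y, a, b, c, d, g):
    """Back-calculate the concentration that gives response y (a < y < d or d < y < a)."""
    ratio = (a - d) / (y - d)
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


def sample_concentration(signal_1, signal_2, params, dilution_factor=1.0):
    """Mean back-calculated concentration of duplicate wells, corrected for dilution."""
    concentrations = [inverse_five_pl(s, *params) for s in (signal_1, signal_2)]
    return dilution_factor * sum(concentrations) / 2.0


# Hypothetical fitted parameters (a, b, c, d, g); not values from this study.
params = (50.0, 1.2, 200.0, 30000.0, 0.8)
signal = five_pl(100.0, *params)          # response expected at 100 pg/ml
print(inverse_five_pl(signal, *params))   # recovers approximately 100
```

Responses that fall outside the range spanned by the fitted asymptotes cannot be inverted and would need to be reported as below or above the limits of the curve.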
Biomarker quantification
Multiplexed bead-based assays were used to measure all analytes on a Biorad Bioplex 100 platform, with the exception of CA-125, which was assayed using a Roche modular E170. Multiplexed interleukin-6 and interleukin-8 assays (BioPlex®) were obtained from Bio-Rad Laboratories, Hercules, CA, USA and data are reported as pg/ml (LD = 10 pg/ml, intra- and inter-assay CV = <15 and <30%, respectively). Multiplexed serum amyloid A (SAA, ng/ml, LD = 0.2 pg/ml, intra- and inter-assay CV = 3.8 and <19.8%, respectively) and C-reactive protein (CRP μg/ml, LD = 6 pg/ml, intra- and inter-assay CV = 8.0 and <17.5%, respectively) assays were obtained from Millipore (Billerica, MA, USA). CA-125 was quantified using Roche CA-125 Elecsys II assay (Roche, Mannheim, Germany, LD = 0.6 U/ml; intra- and inter-assay coefficients of variation CV = 3.3 and 4.3%).
Statistical analysis
Statistical analyses, model development and sample classifications were performed by an independent biostatistician (Emphron Informatics Pty Ltd., Toowong, QLD, Australia). The primary outcome of the Phase II biomarker trial was the statistical comparison (Wilcoxon statistic (Waegeman et al. 2008), see below) of the areas under the ROC curves for the biomarker panel and CA-125. Two-sample group comparisons of median values were assessed by Mann–Whitney tests. Multiple group comparisons were assessed by Kruskal–Wallis tests (Kruskal and Wallis 1952). Dunn's tests (Dunn 1964) were used for post hoc two-sample comparisons. A p value of <0.05 was considered statistically significant.
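For readers unfamiliar with the Kruskal–Wallis test, the statistic can be computed from pooled ranks as sketched below. This is a plain-Python illustration with hypothetical function names and toy data, not the code used in the study; it returns only the tie-corrected H statistic, and the p value (from a chi-square distribution with k − 1 degrees of freedom) would in practice be obtained from a statistics package.

```python
def midranks(values):
    """Rank values from 1..N, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for group in groups for v in group]
    n_total = len(pooled)
    ranks = midranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3.0 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else 0.0


# Toy data: three perfectly separated groups.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```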
Modelling
As previously described, all samples (n = 362) were randomly assigned to two cohorts: the first was designated as the modelling cohort (n = 179) from which a classification algorithm was generated; and the second as the validation cohort (n = 183) which was used to establish the performance of the classification algorithm. A multivariate classification model was developed, based upon biomarker data obtained from the modelling cohort, using a stochastic gradient boosting model with a logistic loss function as previously described (Friedman et al. 2000). The boosted logistic regression algorithm was implemented within the R statistical programming environment (R Development Core Team 2003). The implemented classification algorithm reported a predicted posterior probability value (i.e. the likelihood that a sample came from a woman with ovarian cancer, that is ρP) for each patient sample. ρP values were used to generate ROC curves for the biomarker panel. Biomarker data obtained from the validation cohort and the early stage cohort were submitted to the classification algorithm to establish diagnostic efficiency (i.e. the proportion of samples correctly classified by the modelling algorithm). For classification of samples based on plasma CA-125 concentrations, a threshold value of ≥35 U/ml was used. A threshold value of 0.3 was used for the classification of samples based on ρP.
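The way a fixed threshold translates into sensitivity, specificity and diagnostic efficiency can be sketched as follows. The function names and the toy data are hypothetical; the thresholds (35 U/ml for CA-125 and 0.3 for ρP) are those stated above, and treating a ρP value exactly equal to 0.3 as positive is an assumption.

```python
def classify(values, threshold):
    """Label a sample positive when its value is at or above the threshold."""
    return [v >= threshold for v in values]


def performance(predicted, is_case):
    """Sensitivity, specificity and diagnostic efficiency (proportion correct)."""
    pairs = list(zip(predicted, is_case))
    tp = sum(1 for p, c in pairs if p and c)
    tn = sum(1 for p, c in pairs if not p and not c)
    fp = sum(1 for p, c in pairs if p and not c)
    fn = sum(1 for p, c in pairs if not p and c)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "efficiency": (tp + tn) / len(pairs),
    }


# Toy data only; not values from the study.
is_case = [False, False, True, True, True, False]
ca125 = [12.0, 40.0, 35.0, 500.0, 20.0, 30.0]   # U/ml
rho_p = [0.05, 0.20, 0.45, 0.99, 0.35, 0.10]    # predicted posterior probability

print(performance(classify(ca125, 35.0), is_case))
print(performance(classify(rho_p, 0.3), is_case))
```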
Receiver operator characteristic curve comparisons
The diagnostic performance of the biomarker panel and of CA-125 alone was assessed by comparison of the areas under the ROC curves (Hanley and McNeil 1982). The ROC curve for the biomarker panel was based on ρP values. The area under the ROC curve (AUC) was calculated using the Wilcoxon statistic (Waegeman et al. 2008). As the AUCs for CA-125 and for the biomarker panel are not statistically independent, since they are based on the same patients, the difference in AUC between the diagnostics was statistically assessed using a bootstrap procedure (Efron 1986). The number of bootstrap samples used in this analysis was n = 100,000. The estimators considered were the area under the ROC curve as well as the difference between the AUCs, and the measures of accuracy were the 95% confidence intervals.
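The procedure described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, cases and controls are resampled separately (a stratified bootstrap), and a percentile interval is used, neither of which is specified in the text. The default number of resamples is kept small for speed, whereas the study used 100,000.

```python
import random


def auc_wilcoxon(case_scores, control_scores):
    """AUC as the Wilcoxon statistic: P(case > control) + 0.5 * P(tie)."""
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def bootstrap_auc_difference(panel, ca125, is_case, n_boot=2000, seed=0):
    """95% percentile interval for AUC(panel) - AUC(CA-125).

    The same resampled patients are used for both markers, which preserves
    the correlation between the two AUC estimates.
    """
    rng = random.Random(seed)
    cases = [i for i, c in enumerate(is_case) if c]
    controls = [i for i, c in enumerate(is_case) if not c]
    diffs = []
    for _ in range(n_boot):
        boot_cases = [rng.choice(cases) for _ in cases]
        boot_controls = [rng.choice(controls) for _ in controls]
        auc_panel = auc_wilcoxon([panel[i] for i in boot_cases],
                                 [panel[i] for i in boot_controls])
        auc_ca125 = auc_wilcoxon([ca125[i] for i in boot_cases],
                                 [ca125[i] for i in boot_controls])
        diffs.append(auc_panel - auc_ca125)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[max(int(0.975 * n_boot) - 1, 0)]
    return lower, upper
```

An interval for the difference that excludes zero would be read as evidence that the two AUCs differ.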
Discussion
The primary objective of this study was to test the hypothesis that the area under the ROC curve for a multimarker ovarian cancer panel (CA-125 and acute phase response proteins: CRP, SAA, IL-6 and IL-8) was significantly greater than that observed for CA-125 alone. A phase II biomarker trial (a retrospective, case–control study) that involved 362 patient samples was conducted to test this hypothesis.
All biomarkers were significantly elevated in association with ovarian cancer; individually, however, none of the biomarkers displayed greater diagnostic performance than CA-125. With respect to the area captured under the ROC curves by individual biomarkers, CA-125 > SAA > CRP > IL-8 > IL-6. When biomarker data were used to generate a multivariate classification model, the ρP values from the biomarker panel captured 0.988 of the area under the ROC curve, which was significantly greater than that observed for CA-125 alone (0.960, p < 0.01). When applied to early stage cases alone (i.e. Stages I and II), the performance of the biomarker panel was similar (0.985); CA-125 performance, however, decreased to 0.937 (p < 0.01). These data are consistent with an increased diagnostic efficiency of the multimarker panel for early stage ovarian cancer.
The area under the ROC curve was used as the primary statistical endpoint as this parameter is considered less susceptible to variations in the mix of true positive and true negative samples within the study cohort.
A measure of the reproducibility of biomarker panel performance was obtained by the re-assay of study samples in a second laboratory by an independent operator. The ρP values obtained from both laboratories were not significantly different.
The concentrations of analytes reported in this study for controls and cases are comparable with those previously published (Geisler et al. 1996; Johnson et al. 2008; Kodama et al. 1999; Lambeck et al. 2007; Maccio et al. 1998; Moshkovskii et al. 2005a; Woolas et al. 1993). The diagnostic efficiency of CA-125 within the study cohort (all FIGO stages) was 91.1%. The diagnostic efficiency of CA-125 for ovarian cancer has been previously reported between 70 and 90% (Park et al. 1995; Saraswathi and Malait 1995; Visintin et al. 2008).
All biomarkers utilised in the panel have been previously associated with ovarian cancer. Bertenshaw et al. (2008) reported CRP and IL-8 concentrations as being amongst the most informative ovarian cancer serum biomarkers in a multianalyte profiling study. Similarly, IL-6 and IL-8 have been reported to be elevated in serum of patients with ovarian cancer (Darai et al. 2003; Lambeck et al. 2007; Lokshin et al. 2006) and utilised in multimarker panels for the detection of ovarian cancer (Gorelik et al. 2005). In the latter study, IL-6 and IL-8 were used in combination with other cytokines and CA-125 in a classification tree analysis to deliver a test with greater sensitivity and specificity than CA-125 alone. IL-6 and IL-8 are pleiotropic cytokines that have also been implicated in aspects of tumour growth, disease progression and/or treatment (Hefler et al. 2003; Wang et al. 2005, 2007; Xu and Fidler 2000).
CRP and SAA are major components of the acute phase response (Pepys and Baltz 1983). Several studies have reported elevated serum concentrations of CRP in association with ovarian cancer (Avall Lundqvist et al. 1989; Hefler et al. 2008; Kodama et al. 1999; Maccio et al. 1998; McSorley et al. 2007). Only limited data, however, are available on SAA concentrations in ovarian cancer patients (Helleman et al. 2008). Serum concentrations of CRP are correlated with IL-6, and high concentrations have been reported to be a significant prognostic factor in ovarian cancer (Kodama et al. 1999; Maccio et al. 1998). Indeed, high CRP is reportedly a risk factor for developing ovarian cancer (McSorley et al. 2007).
Moshkovskii et al. (2005b) identified two forms of SAA using SELDI-TOF mass spectrometry. The authors reported that both forms were present in 55% of ovarian cancers compared with only 6% of healthy controls, which suggests that an N-terminal truncated form of SAA may be significant for diagnosis. SAA was further identified in a proteomic study by Helleman et al. (2008) as a potential marker for monitoring of disease progression, where, in combination with CA-125 and seven other markers, a sensitivity of 91–100% was achieved. CRP and SAA have been implicated in a range of neoplastic diseases (Weinstein et al. 1984).
The data obtained in this study are consistent with and support previous observations of the association between elevated acute phase proteins and the presence of ovarian cancer. When such biomarkers are used in combination with CA-125, diagnostic efficiency for ovarian cancer is increased overall (validation cohort) and for early stage disease (early stage cohort). At ρP values of 0.3–0.5, the biomarker panel delivers a balance between sensitivity and specificity, and displays a false positive rate of 6–8%. At this level of performance, while the biomarker panel would reduce by 30–50% the number of women misdiagnosed with cancer by CA-125, it would not be suitable as a screening modality.
The study reported herein is a retrospective, case–control design and the diagnostic performance parameters reported cannot be extended beyond the context of the study. Additional studies, therefore, are required to assess the clinical utility of such multimarker tests within both high-risk cohorts (including women with a genetic predisposition to ovarian cancer) and within the general population (where reliable estimates of positive and negative predictive values may be obtained) (Coates et al. 2008). The utility of biomarker panels for the diagnosis of ovarian cancer, such as that reported herein, may be further enhanced if used in a multimodal approach, for example, in combination with a symptom index as recently described by Andersen et al. (2008).