Introduction
Colorectal cancer (CRC) is among the three most commonly diagnosed cancers and causes of death in developed countries [
1]. The risk of developing colorectal cancer increases with age, and its malignant progression takes about 10–15 years, thus offering time for an early diagnosis [
2]. The survival chances for cancer patients also depend strongly on the disease stage and can be significantly improved by early detection [
3].
In industrialized countries, screening for CRC is initially performed using guaiac-based fecal occult blood tests (gFOBT) or haemoglobin-based fecal immunological tests (FIT). Subsequently, CRC can be detected and monitored endoscopically (colonoscopy), which represents the gold standard for CRC examination but is not comprehensively applied in routine diagnostics [
4]. The use of fecal-based tests such as gFOBT or FIT is widely criticized among clinical experts, since the average sensitivity of stool blood tests is substantially limited [
5,
6]. In addition, false-positive test results can arise from cancer-unspecific internal bleeding, leading to potentially unnecessary follow-up examinations [
7].
In order to improve the early detection and survival of CRC, highly sensitive, low-priced diagnostic tools should be developed. Currently, the molecular genetic analysis of circulating cell-free DNA (cfDNA) in the bloodstream of patients (“liquid biopsy”) has proven to be a promising approach [
8]. CfDNA fragments are released from normal and tumor cells as a result of cellular degradation processes or exocytosis [
9,
10]. In multiple cancer entities, including CRC, the concentration of cfDNA in patient plasma was found to be significantly higher compared to healthy control subjects [
11]. Several studies demonstrated that the sensitivity and specificity of cfDNA quantification are superior to those of gFOBT, as reviewed in Petit et al. [
12].
A further improvement of cfDNA-based diagnostics was proposed by determining the length of the cfDNA fragments [
12‐
14]. This approach comprises the amplification of a short (~ 100 bp) and a long fragment (~ 250 bp) of cfDNA markers via quantitative real-time PCR (qPCR). The ratio between long and short fragments forms the so-called DNA integrity index (DII) and is believed to represent the difference between apoptotic and necrotic cell degradation processes. It has been shown that as a result of necrosis, genomic DNA fragments with a length of > 250 bp are formed, whereas apoptotic nuclease activity results in a fragment length of < 180 bp [
15]. It is assumed that cancer cells initiate necrotic cell death due to an active suppression of p53-mediated apoptosis [
16]. For CRC, some studies reported an increased DII [
13,
14,
17‐
20], whereas others presented a different view, challenging the hypothesis of elevated necrotic cell death associated with increased cfDNA levels in the plasma of tumor patients [
21‐
27].
In addition to nuclear cfDNA (n-cfDNA) analysis, quantification of cell-free mitochondrial DNA (mt-cfDNA) in blood plasma was shown to improve early detection of CRC [
21,
28], presumably due to the higher copy number per cell [
29,
30]. To date, research on the association between an altered mt-cfDNA concentration and the presence of CRC has yielded inconclusive data. A systematic analysis of the DNA integrity index with nuclear and mitochondrial markers in distinct tumour stages of CRC has not yet been published.
The aim of this study was, first, to analyze total cfDNA concentration in CRC patients with different histopathological stages; second, to quantify fragments of the well-known n-cfDNA markers KRAS and Alu, as well as the mt-cfDNA marker MTCO3, independently of total cfDNA levels; and third, to examine the integrity index. We also assessed whether our data can be compared to previous work that measured these markers as a function of total cfDNA concentration. Finally, we evaluated the diagnostic accuracy of our approach by comparing the discriminative ability of all markers and their ratios (a) independently of the total cfDNA concentration (“equal template concentration”, ETC) and (b) in relation to it (“normalized to total cfDNA”, NTC).
Material and methods
Patients and samples
The study was conducted in accordance with the Declaration of Helsinki and performed under STARD guidelines [
31]. The study is officially registered on DRKS, the German register for clinical trials (DRKS00030257). Local ethics committee approval and informed patient consent were obtained. All colorectal cancer patients were ≥ 18 years of age and had not been treated with radiotherapy or chemotherapy prior to blood sampling. Blood samples of 80 consecutive patients were collected before surgical care. The CRC cohort included patients with UICC stage I (n = 21), UICC II (n = 21), UICC III (n = 20) and UICC IV (n = 18), confirmed after surgery by histopathological examination according to established standard diagnostic procedures. Blood samples from 50 healthy individuals were provided by Central BioHub®, a commercial biobank that hosts collections of human biospecimens for scientific research.
All blood samples were prospectively collected using K2EDTA BD Vacutainer® collection tubes (Becton Dickinson, Germany). On the day of venipuncture, blood samples were centrifuged at 2,000 × g for 10 min and the plasma supernatants were carefully removed, avoiding the buffy coat. Plasma aliquots of CRC patients and controls were stored at –80 °C until analysis.
Sample processing, total cfDNA extraction and quantification
A total of 1 ml of each blinded plasma sample was thawed at room temperature and centrifuged at 16,000 × g for 10 min at 4 °C. The supernatant was transferred to a 1.5 ml tube, and cfDNA extraction was performed using the QIAamp® Circulating Nucleic Acid Kit (Qiagen, Germany) according to the manufacturer's protocol, except that the column-based isolation of total cfDNA was performed by centrifugation at 2,000 × g instead of using the vacuum pump. Extracted cfDNA was eluted with 50 µl elution buffer, and total cfDNA concentration was determined using the Qubit™ dsDNA HS Assay Kit and Qubit™ 3.0 Fluorometer (ThermoFisher, Fisher Scientific, Invitrogen, Germany). The samples were stored at –20 °C prior to quantitative real-time PCR (qPCR) analysis.
Quantity and integrity index of nuclear and mitochondrial cfDNA
To measure n-cfDNA and mt-cfDNA marker quantities, short and long fragments of the KRAS, Alu and MTCO3 markers were targeted with qPCR. Note that n-cfDNA markers in the plasma of healthy individuals and CRC patients were analysed independently of their total cfDNA concentration, using equal template concentrations (ETC). Technically, we reasoned that this experimental approach ensures equal molarity of components in the qPCR reactions and may therefore help to evaluate precisely whether short and long cfDNA fragments are increased, decreased or unaltered in both groups.
qPCR was performed on a LightCycler 96 (Roche, Germany). All cfDNA samples were diluted to a final template concentration of 0.1 ng/µl, and 2 µl was used as template. All qPCR reactions were performed in a 15 µl reaction volume containing 1 × PowerUp™ SYBR™ Green Master Mix and 0.25 µM of primer. Human genomic DNA isolated from pancreatic tissue was used as positive control and a no-template control as negative control. Cycling conditions consisted of initial denaturation at 95 °C for 2 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. A standard curve with serial dilutions of genomic DNA (0.005, 0.01, 0.025, 0.05, 0.5, 1.0, 2.0 ng/µl) was used to calculate a logarithmic trend line, and the cycle threshold (Ct) values of the samples were converted to quantities along this trend line. The resulting data (ng/µl) were used to determine plasma cfDNA concentration as follows: Plasma cfDNA concentration = qPCR data (ng/µl) × extraction elution volume (µl) ÷ plasma volume (µl). The final data are expressed in ng/ml using the mean values of qPCR triplicates (ETC).
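The quantification step above can be illustrated with a minimal Python sketch. It fits a log-linear standard curve (Ct against log10 concentration) by ordinary least squares, converts a measured Ct back to a concentration, and scales the result to plasma concentration in ng/ml. Only the dilution series, the 50 µl elution volume and the 1 ml plasma volume are taken from the protocol. The Ct values, the function names and the choice to average quantities (rather than Ct values) across triplicates are assumptions made for illustration.

```python
import math

def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ct_to_concentration(ct, slope, intercept):
    """Invert the standard curve: concentration in ng/µl for a given Ct."""
    return 10 ** ((ct - intercept) / slope)

def plasma_concentration_ng_per_ml(qpcr_ng_per_ul, elution_ul=50.0, plasma_ul=1000.0):
    """qPCR data (ng/µl) x elution volume (µl) / plasma volume (µl), scaled to ng/ml."""
    ng_per_ul_plasma = qpcr_ng_per_ul * elution_ul / plasma_ul
    return ng_per_ul_plasma * 1000.0  # 1 ml = 1000 µl

# Dilution series from the protocol (ng/µl)
STANDARDS = [0.005, 0.01, 0.025, 0.05, 0.5, 1.0, 2.0]
# Hypothetical Ct values lying exactly on a curve with slope -3.32, intercept 25
EXAMPLE_CT = [25.0 - 3.32 * math.log10(c) for c in STANDARDS]

slope, intercept = fit_standard_curve(STANDARDS, EXAMPLE_CT)

# Hypothetical triplicate Ct values for one sample
triplicate_ct = [28.30, 28.35, 28.25]
quantities = [ct_to_concentration(ct, slope, intercept) for ct in triplicate_ct]
mean_quantity = sum(quantities) / len(quantities)
print(plasma_concentration_ng_per_ml(mean_quantity))
```

With a 50 µl eluate from 1 ml of plasma, a qPCR reading in ng/µl is simply multiplied by 50 to obtain ng per ml of plasma.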
To compare our results with existing data from the literature, we calculated the absolute quantity of short and long cfDNA fragments by normalizing the data with the dilution factor of each sample used to obtain the 0.1 ng/µl template concentration (normalized to total concentration, NTC).
The DNA integrity index was calculated as the ratio of long to short fragments (e.g. KRAS 305/KRAS 67). Oligonucleotides of KRAS 67, KRAS 305, Alu 115, Alu 247, MTCO3 67 and MTCO3 296 are listed in Table S4.
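The NTC normalization and the integrity index can be sketched as follows. This is a minimal illustration that assumes the dilution factor equals the measured total cfDNA concentration of the eluate divided by the 0.1 ng/µl template concentration. The function names and the numerical values are hypothetical.

```python
TEMPLATE_NG_PER_UL = 0.1  # equal template concentration used for all samples (ETC)

def dilution_factor(total_cfdna_ng_per_ul):
    """Factor by which an eluate was diluted to reach the template concentration."""
    return total_cfdna_ng_per_ul / TEMPLATE_NG_PER_UL

def normalize_to_total(etc_quantity, total_cfdna_ng_per_ul):
    """Scale an ETC fragment quantity back to the undiluted sample (NTC)."""
    return etc_quantity * dilution_factor(total_cfdna_ng_per_ul)

def integrity_index(long_quantity, short_quantity):
    """DNA integrity index: ratio of long to short fragment quantity."""
    if short_quantity <= 0:
        raise ValueError("short fragment quantity must be positive")
    return long_quantity / short_quantity

# Hypothetical sample: total cfDNA 0.4 ng/µl, fragment quantities measured at ETC
total = 0.4
short_etc, long_etc = 2.0, 0.5  # e.g. KRAS 67 and KRAS 305, arbitrary units

short_ntc = normalize_to_total(short_etc, total)
long_ntc = normalize_to_total(long_etc, total)

print(integrity_index(long_etc, short_etc))
print(integrity_index(long_ntc, short_ntc))
```

In this sketch the same factor multiplies both fragments of a sample, so the per-sample ratio is identical in both conditions, while the individual fragment quantities scale with the total cfDNA concentration.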
Statistical analysis
First, the group of healthy individuals was compared with the entire CRC cohort; second, the CRC cohort was subdivided by UICC stage, and the stages were compared with each other and with the control group. The case numbers are sufficient to differentiate healthy individuals from the total CRC cohort at medium effect sizes with adequate statistical power (80%) for the analysed cfDNA markers.
Data were analysed in three steps. First, total cfDNA concentration, short and long fragment quantities and DII scores were summarized descriptively by median and interquartile range (IQR) and presented graphically as boxplots. Additionally, average biomarker levels were compared between analysis groups by Mann–Whitney U tests (complete CRC cohort vs. controls) and Kruskal–Wallis tests (controls vs. CRC cohort stratified by UICC stage), followed by Bonferroni-Holm-adjusted multiple pairwise comparisons in case of significant omnibus test results.
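The Bonferroni-Holm adjustment applied to the pairwise comparisons can be sketched in a few lines of Python. This is an illustrative step-down implementation with invented p-values, not the code used in the study (the analysis was run in R).

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the ordering
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons
raw = [0.01, 0.04, 0.03, 0.005]
print(holm_adjust(raw))
```

A comparison is declared significant when its adjusted p-value falls below the chosen significance level.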
Second, diagnostic accuracy was assessed separately for each biomarker (stratified by ETC/NTC condition) regarding its ability to distinguish between healthy individuals and the CRC cohort, and between individuals of the different UICC stages. To keep the number of comparisons manageable, individuals with UICC stages I and II, and those with stages III and IV, were each grouped together for further analyses. Optimal cut-off values were determined using Youden's index. To quantify diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and AUC with 95% confidence intervals (CI) were calculated.
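As an illustration of the cut-off selection described above, the following Python sketch scans candidate thresholds and keeps the one that maximizes Youden's index J = sensitivity + specificity − 1. It is a simplified stand-in for the R packages used in the study: the function name and data are hypothetical, candidate cut-offs are taken from the observed values themselves, and ties are resolved in favour of the lowest cut-off.

```python
def youden_cutoff(values, labels, higher_is_positive=True):
    """Return the cut-off maximizing Youden's J, with the accuracy measures at it.

    labels: 1 for diseased (e.g. CRC), 0 for healthy.
    """
    best = None
    for cut in sorted(set(values)):
        tp = fp = tn = fn = 0
        for value, label in zip(values, labels):
            positive = value >= cut if higher_is_positive else value <= cut
            if positive and label == 1:
                tp += 1
            elif positive and label == 0:
                fp += 1
            elif not positive and label == 0:
                tn += 1
            else:
                fn += 1
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1
        if best is None or j > best["J"]:
            best = {
                "cutoff": cut,
                "J": j,
                "sensitivity": sensitivity,
                "specificity": specificity,
                # Predictive values are undefined if no sample falls on that side
                "PPV": tp / (tp + fp) if tp + fp else None,
                "NPV": tn / (tn + fn) if tn + fn else None,
            }
    return best

# Hypothetical marker values (arbitrary units): six controls, four patients
values = [3, 4, 5, 6, 7, 9, 8, 10, 11, 12]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
print(youden_cutoff(values, labels))
```

For markers that decrease in disease, such as the long fragments in the ETC condition, `higher_is_positive=False` reverses the direction of the test.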
To assess the combined performance of the biomarkers in differentiating between healthy individuals and individuals with CRC, as well as between patients with different UICC stages, a relaxed multinomial logistic LASSO (least absolute shrinkage and selection operator) regression model was fitted, with tenfold cross-validation used to select the optimal penalty. In this approach, a penalty term in the regression equation reduces the absolute value of the regression coefficients (possibly to zero) and thereby regulates the impact that a single predictor may have on the overall regression (or whether it is included at all) [
32,
33]. Model fitting identifies two important penalty values: one that minimizes the misclassification error, and a stronger penalty whose error is still within one standard error of the minimum misclassification error. For the reported models, we used the higher penalty value, because it results in a smaller number of selected predictor variables and thus in models that are more likely to be externally valid. The relaxation undoes the shrinkage of the regression coefficients (unpenalized regression) for those predictor variables with non-zero regression coefficients. Estimated multinomial logit coefficients were exponentiated and reported as relative risk ratios.
This approach offers some important advantages over classical regression analysis: 1) the selected predictors (those with non-zero regression coefficients) are more robust for future predictions, as this approach is less susceptible to random noise in the predictor values; 2) problematically high intercorrelations between the predictor values (multicollinearity) can be adequately taken into account; 3) overfitting to the data is avoided.
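The choice between the two penalty values described above can be sketched as follows. The snippet takes a grid of penalties with their cross-validated misclassification errors and standard errors, and returns both the error-minimizing penalty and the largest penalty whose error lies within one standard error of that minimum (in glmnet these correspond to `lambda.min` and `lambda.1se`). The numbers and the function name are invented for illustration, and the cross-validation itself is not reproduced here.

```python
def select_penalties(lambdas, cv_errors, cv_standard_errors):
    """Return (penalty at minimum CV error, largest penalty within one SE of it)."""
    indices = range(len(lambdas))
    i_min = min(indices, key=lambda i: cv_errors[i])
    threshold = cv_errors[i_min] + cv_standard_errors[i_min]
    # All penalties whose CV error does not exceed the one-standard-error threshold
    candidates = [i for i in indices if cv_errors[i] <= threshold]
    i_1se = max(candidates, key=lambda i: lambdas[i])
    return lambdas[i_min], lambdas[i_1se]

# Hypothetical cross-validation results over a penalty grid
lambdas = [0.5, 0.2, 0.1, 0.05, 0.01]
cv_errors = [0.40, 0.30, 0.26, 0.25, 0.27]
cv_standard_errors = [0.03, 0.03, 0.03, 0.03, 0.03]

lambda_min, lambda_1se = select_penalties(lambdas, cv_errors, cv_standard_errors)
print(lambda_min, lambda_1se)
```

In this example the stronger penalty is 0.1 rather than 0.05, which typically retains fewer predictors at a cross-validated error that is statistically indistinguishable from the minimum.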
Diagnostic accuracy performance in predicting disease status was assessed using all biomarkers of the ETC condition or the NTC condition and a combined set of markers of both conditions, with healthy individuals as reference category and individuals with UICC stages I/II and III/IV each collapsed in one category.
Data analysis was performed with R version 4.2.1 (R Foundation for Statistical Computing, Vienna, 2022), in particular using the “epiR” package (version 2.0.50) to calculate diagnostic accuracy values, “pROC” (version 1.18.0) to calculate and display ROC curves [
34], and “glmnet” (version 4.1–4) to fit logistic LASSO regression models [
32].
Discussion
A large number of studies have shown that higher total cfDNA levels can be detected in the plasma of colorectal cancer patients, especially in late tumour stages [
11,
12,
35,
36]. To date, it remains unclear how the lengths of nuclear and mitochondrial cfDNA fragments contribute to total cfDNA quantity and whether they can be used as sensitive and reliable biomarkers in CRC diagnostics. It is considered that tumour-derived necrotic DNA degradation results in cfDNA fragments with a length of > 250 bp, whereas normal apoptotic cells produce fragments of around 180 bp or less [
15,
37]. An increased ratio of long to short fragments (DII) was found in a considerable number of studies, but others did not support this theory [
21‐
27].
Our study was designed to specifically address the question of whether short and/or long fragments are altered in CRC patients compared to healthy individuals. Therefore, we intended to precisely measure n-cfDNA and mt-cfDNA independently of the total concentration of isolated cfDNA, arguing that the ratio between short and long fragments in cfDNA samples from a given study group must remain identical.
In agreement with previous findings applying fluorescence-based cfDNA quantification methods, reviewed in Petit et al. [
12], we detected a higher concentration of total cfDNA in plasma samples of CRC patients compared to healthy controls. In addition, the level was observed to increase with the pathological stage of CRC, suggesting an increase with tumour malignancy.
Using equal cfDNA concentration (ETC) as template for qPCR analysis, controls and CRC patients showed similar levels of both KRAS 67 and Alu 115 markers. This outcome remained unaltered across all stages of CRC, indicating that there is no measurable difference in short n-cfDNA fragment levels between the healthy and cancer groups in the ETC condition. However, for the long fragments of both n-cfDNA markers, KRAS 305 and Alu 247, a significant decrease was detected when comparing both cohorts. Strikingly, decreased levels of long n-cfDNA fragments were also observed in later stages of CRC, although with statistical significance only in stage IV, indicating that advanced tumour malignancy is inversely related to n-cfDNA stability. Likewise, significantly reduced DII scores were detected for both KRAS 305/67 and Alu 247/115. Significant reductions were observed primarily in UICC stage IV, which showed the most pronounced decrease of long n-cfDNA fragments. Importantly, the results changed profoundly when the qPCR data were normalized to the actual cfDNA concentration of each sample (NTC). We believe that this approach most closely resembles previous qPCR analyses, in which cfDNA samples were analysed via qPCR irrespective of their concentration. Accordingly, the level of short n-cfDNA fragments increased significantly in the CRC cohort and with higher pathological stages. This result demonstrates that the n-cfDNA concentration is generally higher in the plasma of CRC patients and confirms previous findings [
13,
14,
18,
20‐
28]. For long n-cfDNA fragments in the NTC condition, the calculated quantities of both KRAS 305 and Alu 247 did not differ significantly between healthy individuals and CRC patients of any stage. Although we noticed a slight increase in the level of both markers from stage II to IV, this observation was without statistical significance. At this point, our findings differed from previous studies that reported elevated cfDNA levels including long-fragment markers [
20‐
22,
25,
27]. Nevertheless, Mead et al. reported increased median levels of total cfDNA and Alu 115 rising from the control group to the benign polyp and cancer groups, whereas long fragment levels of Alu (247) and Line1 (300) were comparable or even lower in CRC patients compared to individuals with benign polyps [
21]. Of note, Bhangu et al. reported a significantly decreased median level of Line 297 in total CRC patients as well as patients with either stage I-III or stage IV compared to control individuals. However, in contrast to increased levels of Alu 115, no significant difference between controls and patients was found for Line 79 [
26]. Furthermore, the integrity index of patients with different histopathological stages of CRC was reported to decrease significantly in stage IV compared to stage II. This observation might be explained by an increase in the level of Alu 83 that was more pronounced than that of Alu 244 [
22]. A significantly decreased DII was also observed by Yörüker et al. and Sinha et al., with both research groups investigating multicopy transposable elements such as Alu in stage IV CRC patient plasma compared to healthy controls [
23,
27]. Pu et al. found similar median fragment levels of Alu 219 in CRC patients with stage 0, I and II, while only levels in stage IV patients were significantly higher than in control individuals. In contrast, Alu 115 values were significantly higher in stage I, II and IV, resulting in a decreased cfDNA integrity in all stages compared to healthy controls [
25].
We assume that, in the NTC condition, the level of short n-cfDNA fragments in particular increases in the plasma of CRC patients in proportion to the level of total cfDNA, particularly in advanced stages of CRC. In contrast, the level of long n-cfDNA fragments did not, suggesting that increased levels of total cfDNA in CRC patients are not associated with increased necrotic cellular degradation. This result is also in accordance with a recent view whereby increased tumor-derived cfDNA predominantly comprises shorter fragments than cfDNA from healthy individuals [
23‐
25]. In further support of our view, Mouliere et al. reported, using atomic force microscopy, that the median size distribution of cfDNA in the plasma of metastatic CRC patients was lower than in healthy individuals [
38]. In this context, several recent studies applying next generation sequencing (NGS) techniques on cfDNA extracted from plasma of cancer patients revealed significantly lower levels of long fragmented cfDNA [
39,
40].
The analysis of short and long fragments of MTCO3 revealed a highly significant decrease of both markers in the CRC group in the ETC condition. Moreover, both fragment levels were significantly reduced in almost all stages of CRC, with the exception of stage III. Considering the particular relevance of early stages (I and II) for CRC detection, our data suggest an improvement for cfDNA-based diagnostics using mitochondrial markers. Of note, decreased mt-cfDNA levels were also found when normalizing our data to the total cfDNA concentration (NTC). However, this approach markedly reduced the difference in mt-cfDNA quantity between the CRC and control groups as well as between UICC stages, relative to the data obtained from equal template concentration. Thus, our findings may have revealed a weak point of using unequalized total cfDNA concentrations as template in qPCR-based CRC diagnostics (NTC) and provide additional evidence for the usefulness of our approach (ETC). Surprisingly, the DII of mt-cfDNA fragments in CRC patients was significantly, albeit weakly, higher compared to healthy individuals. This observation was in contrast to the DII of n-cfDNA and may be due to significant differences between healthy individuals and CRC patients in long and short MTCO3 fragment levels. At this point, our data further suggest a fundamental difference between n-cfDNA and mt-cfDNA with regard to their integrity. However, we can only speculate whether this difference is due to an alternate origin or mode of degradation. Of note, mitochondrial DNA lacks a nucleosomal core structure, and it was recently published that plasmatic mt-cfDNA is more stable than n-cfDNA [
41]. In addition, it was reported that a substantial amount of whole cell-free mitochondria with intact respiratory metabolism is present in human plasma, in addition to their known presence in microvesicles [
41]. Nonetheless, in our view, the prognostic value of mitochondrial DII is questionable at this point, since there is no significant difference between the UICC stages. Even so, we confirmed previous findings in which MTCO3 was used as a potential biomarker, whereby mt-cfDNA concentration in CRC patients decreases significantly compared to healthy individuals [
28]. With regard to other mitochondrial target sequences, specifically MTND1, Mead et al. reported an increased mt-cfDNA level in the polyp and cancer populations compared to control individuals. However, in contrast to the increased Alu 115 quantity, no difference in mt-cfDNA concentration was found between the polyp and cancer groups [
21]. Strikingly, several recent studies demonstrated a significantly higher mt-cfDNA concentration (copy number) in plasma of healthy controls compared to CRC patients applying NGS-based approaches [
40,
42].
Generally, we believe that the contribution of tumour-specific n- and mt-cfDNA might be far too low to explain the elevated total cfDNA level or the changes in the DII in the plasma of CRC patients. It is therefore necessary to consider other non-malignant cells or cellular processes as a major source of cfDNA. This is supported by research showing that stromal, endothelial and immune cells also contribute to the microenvironment of tumour tissue, either supporting or opposing cancer formation [
36,
43]. Therefore, although it is plausible that senescent tumour cells in CRC patients frequently undergo necrosis, cancer cell death might be masked by enhanced apoptosis of cells of as yet unknown origin.
An additional important objective of this study was to determine marker-specific cut-offs for the diagnostic use of biomarkers to differentiate groups. Specifically, we report the cut-offs for the following group differentiations: 1) healthy controls versus total CRC patients, 2) healthy controls versus CRC patients with UICC stage I and II, 3) healthy controls versus patients with stage III and IV, and 4) UICC stage I and II versus stage III and IV. For this purpose, we used the data measured either independently of (ETC) or in relation to the total cfDNA concentration (NTC) (Table S3). Of note, total cfDNA concentration measured fluorometrically yielded one of the highest AUCs in all four group differentiations and indicated the best CRC detection rate for patients with later stages (III/IV). Compared with previous studies in which a comparable detection method was used, our results were moderately better. For early stages (I/II), an AUC of 0.64 (
P = 0.03) with 42% sensitivity and 75% specificity and for later stages (III/IV) an AUC of 0.63 (
P = 0.003) with 63% sensitivity and 75% specificity was reported [
44]. El-Gayar et al. distinguished CRC patients from healthy donors with an AUC of 73% (
P = 0.004), a sensitivity of 68% and a specificity of 65% [
14].
For the evaluation of diagnostic potential using single markers, the best results were obtained for both mt-cfDNA fragments in the ETC condition, highlighting the potential of mt-cfDNA biomarkers in early-stage CRC detection. This result is in agreement with data from Mead et al., in which ROC curve analysis of a single mt-cfDNA marker was able to significantly (
P < 0.001) differentiate patients (polyps and CRC) from healthy control [
21]. For single n-cfDNA marker quantities, the highest diagnostic accuracies were obtained for the longer fragments of KRAS and Alu in the ETC condition and the shorter fragments of KRAS and Alu in the NTC condition. However, detection of CRC with high sensitivity and specificity was only reached in advanced tumour progression (UICC stages III and IV), suggesting a subordinate importance of these markers for early diagnosis. Consistent with our results, several studies reported a clear discrimination between healthy and advanced or metastatic CRC populations with high sensitivity and specificity of single n-cfDNA markers, most of them targeting the Alu sequence [
18,
21,
26,
27].
In our study, the DII of n-cfDNA and mt-cfDNA proved to be an effective determinant for significantly differentiating the aforementioned groups. However, we conclude that DII scores from both n- and mt-cfDNA markers in our analysis did not perform better than single markers, especially the total cfDNA concentration. In our view, this could be explained by the fact that the DII, as the ratio between long and short fragment quantities, is limited by the discriminatory capability of its “best” single marker. For example, in the ETC condition, the median quantity of Alu 247 fragments performed best, whereas in the NTC condition Alu 115 seemed to be a more reliable marker (Figs. 2 and 3).
Using LASSO multinomial logistic regression, a modern and robust statistical technique that circumvents multicollinearity problems between predictor variables, identifies important predictors and provides robust estimates, we investigated the relationship between biomarkers and UICC stages of colorectal cancer. We applied this modelling approach to different sets of predictor variables and evaluated the predictive accuracy of the different models using misclassification error. Both models based on a single set of biomarkers (ETC or NTC) resulted in approximately equal diagnostic prediction accuracy, whereas the model that included biomarkers of both the ETC and NTC conditions had approximately 30% lower misclassification error rates. Interestingly, only long-fragment biomarkers were selected as predictors in the final model, with the exception of the MTCO3 marker (67 bp).