This retrospective study used three levels of statistical analysis of increasing complexity to investigate the impact of different types of predictors on the occurrence of adverse effects in 155 participants following treatment with fluoropyrimidine-based chemotherapy regimens. The common findings across all analyses were associations between different adverse effects, and associations between genetic and non-genetic predictors and adverse effects.
Bivariate associations between adverse effects and between predictors of adverse effects
We first examined associations between adverse effects following treatment with fluoropyrimidine-based chemotherapy regimens using bivariate analysis, given the body of evidence reporting concurrent adverse effects, or symptom clustering, during chemotherapy treatment. The strongest associations were identified between the overall GI toxicity adverse effect and the individual adverse effects (i.e. nausea and/or vomiting, mucositis and diarrhea) that combined to give the overall effect. However, this study also identified other, weaker associations, for example between neuropathy, skin toxicity and cardiotoxicity and other adverse effects. Importantly, neuropathy and GI effects, which were associated in this study, had previously been indirectly linked in a Bayesian network analysis in colorectal cancer patients receiving different chemotherapy regimens [20]. These associations may be explained by common underlying mechanisms for these different adverse effects, such as the increased pro-inflammatory signaling that occurs with GI adverse effects [2] and with neuropathy following chemotherapy [21]. Similarly, the associations between skin toxicity, measured as rash or alopecia in this study, and fatigue, neuropathy and mucositis were expected, given that rash was indirectly linked to fatigue, neuropathy and mucositis in a previous Bayesian network analysis [20]. Finally, the links we observed between cardiotoxicity, measured as palpitations in the current study, and generalized pain and fatigue are in accordance with previous reports linking palpitations with pain [20] and fatigue [20, 22].
Similar to associations between adverse effects, it was important to consider bivariate associations between the investigated genetic and non-genetic predictors prior to logistic regression and Bayesian network analysis to maintain the study power by keeping the number of predictors within the dataset to a minimum. Whilst there were some strong associations detected, for example between the IL1B rs16944 and rs1143627 SNPs, between the DPYD rs67376798 SNP and body surface area, and between body surface area and arthritis, none of these associations were evident in 100% of the participants. Consequently, we did not remove any predictors from further analysis on the basis of these bivariate associations.
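The exclusion criterion described above (an association evident in 100% of participants) can be illustrated with a minimal Python sketch. It assumes that "evident in 100% of the participants" means a one-to-one pairing between the levels of two categorical predictors; the function name, the genotype codes and the data are hypothetical and are not taken from the study, and the bivariate test actually used is not restated here.

```python
from collections import defaultdict


def perfectly_associated(x, y):
    """True if each level of x pairs with exactly one level of y and vice versa."""
    if len(x) != len(y):
        raise ValueError("predictors must cover the same participants")
    x_to_y, y_to_x = defaultdict(set), defaultdict(set)
    for a, b in zip(x, y):
        x_to_y[a].add(b)
        y_to_x[b].add(a)
    return all(len(v) == 1 for v in x_to_y.values()) and all(
        len(v) == 1 for v in y_to_x.values()
    )


# Hypothetical genotype codes for five participants (not study data).
snp_a = ["WT", "WT", "het", "var", "het"]
snp_b = ["WT", "WT", "het", "var", "var"]  # differs from snp_a for one participant
snp_c = ["AA", "AA", "AG", "GG", "AG"]     # a pure relabelling of snp_a

keep_both = not perfectly_associated(snp_a, snp_b)  # strong but imperfect: keep both
drop_one = perfectly_associated(snp_a, snp_c)       # redundant: one could be removed
```

Under this reading, a strongly but imperfectly associated pair such as `snp_a` and `snp_b` would both be retained, which matches the decision not to remove any predictors.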
Logistic regression models associating genetic and non-genetic predictors with adverse effects
Our next step was to use logistic regression to determine which genetic and non-genetic predictors were associated with each adverse effect. The % AUC of the ROC curves was used to indicate the sensitivity and specificity of the models that included genetic predictors alone, non-genetic predictors alone, and all predictors.
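As an illustration of how a % AUC allows the three model types to be compared, the following Python sketch computes the AUC of the ROC curve from predicted probabilities using the rank (Mann-Whitney) formulation. The function name and all values are hypothetical and are not study data; the sketch assumes the logistic regression models have already been fitted and does not reproduce the software used in the study.

```python
def roc_auc(labels, scores):
    """AUC of the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 = adverse effect occurred, 0 = did not occur.
    scores: predicted probabilities from a fitted model.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six participants (not study data).
occurred = [1, 1, 1, 0, 0, 0]
model_scores = {
    "genetic only": [0.8, 0.6, 0.4, 0.5, 0.3, 0.2],
    "non-genetic only": [0.6, 0.5, 0.3, 0.5, 0.4, 0.2],
    "all predictors": [0.9, 0.7, 0.6, 0.5, 0.3, 0.1],
}
percent_auc = {name: 100 * roc_auc(occurred, s) for name, s in model_scores.items()}
```

In this toy example the "genetic only" scores rank 8 of the 9 case/non-case pairs correctly, so that model has the higher % AUC of the two single-type models, which is the kind of comparison used to judge the relative role of each predictor type.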
Briefly, the genetic predictors appeared to play a greater role than the non-genetic predictors in overall GI toxicity, nausea and/or vomiting and constipation, as evidenced by the higher % AUC for the ROC curves, and were the sole predictors of neutropenia. Consequently, we have limited our discussion of these models to the genetic predictors observed.
With regards to GI-related adverse effects and genetic predictors, a TGFB1 (rs1800469) SNP was common to the overall GI toxicity and nausea and/or vomiting models. The variant allele of this SNP has previously been associated with increased expression of anti-inflammatory TGF-β compared to the WT allele [23, 24]. Consequently, participants with the WT allele would be expected to express less TGF-β, which may explain their increased probability of experiencing these adverse effects. The other SNP associated with increased probability of nausea and/or vomiting was IL6 rs10499563, which aligns with previous research reporting that carriers of the WT allele have increased IL-6 expression [25]. In contrast, participants with WT alleles of the TLR4 SNP rs4986790 had a decreased probability of nausea and/or vomiting. Although there is mixed evidence linking this SNP to changes in function (for review see Ref. [26]), more recently this SNP has been linked to reduced IL-6 plasma expression following mixed chemotherapy regimens including irinotecan and 5-FU [27]. Consequently, carriers of WT alleles of TLR4 may express less IL-6 and have reduced inflammation, reducing the probability of nausea and/or vomiting. Therefore, associations with these genetic predictors support the pro-inflammatory hypothesis underlying occurrence of these GI-related adverse effects [2, 28, 29].
In contrast, it is less clear why the IL6R rs8192284 and DPYD rs2297595 SNPs were predictors of increased probability, and BDNF rs6265 of decreased probability, of constipation. WT alleles of IL6R have previously been reported to result in less soluble plasma IL-6R than variant alleles [25]. This has the potential to reduce IL-6 signaling and hence participants with WT alleles would be expected to have a decreased probability of experiencing constipation, not the increased probability observed. In addition, although there have been no reports associating the DPYD SNP with constipation, it has recently been linked to other adverse effects, including abdominal pain, oral mucositis and diarrhea [30]. Finally, the decreased probability of constipation in participants with WT alleles of the BDNF rs6265 SNP, which are associated with higher BDNF secretion [31], aligns with the role that BDNF plays in GI motility. For example, it has been reported that people with slow-transit constipation, classified by the Rome II criteria and having intestinal transit time of ≥ 96 h, had 62% lower colonic BDNF expression compared to controls [32]. Therefore, due to the unclear nature of the impact of some of these genetic predictors, more detailed research is required to elucidate their associations with constipation. This research would also need to investigate drug-related constipation (e.g. from the use of opioids or anti-motility agents), which was not captured in the current study, as an additional predictor for this adverse effect.
Similarly, the underlying mechanisms associating the observed genetic predictors with neutropenia are also unclear in light of previous literature and the paucity of reports regarding the functional impact of these SNPs. For example, an increased probability of neutropenia with WT alleles of BDNF rs6265, and hence increased BDNF secretion, was not expected, as BDNF has previously been shown to promote immune recovery after radiation treatment in a pre-clinical model [33]. In contrast, a recent pre-clinical report of Lacticaseibacillus casei treatment reducing both 5-FU-induced leukopenia and TLR4 gene expression in the jejunum [34] may explain why carriers of WT alleles of TLR4 rs4986790 had a decreased probability of neutropenia. The link between the CASP5 rs554344 SNP WT allele and decreased probability of neutropenia may be explained by the key role of caspase-5 in the inflammasome (for review see Ref. [35]). Finally, the observed mixed (decreased and increased) probability of neutropenia in carriers with 1 or 2 copies of WT alleles of TGFB1 limits meaningful discussion of a possible association.
In contrast to the adverse effects above, non-genetic predictors played a greater role in diarrhea, mucositis, hand–foot syndrome, cardiotoxicity and fatigue models, and were the sole predictors in the neuropathy, generalized pain and skin toxicity models. As most predictors were associated with only one type of adverse effect, we have limited our discussion here to the most common predictors across the adverse effects.
Firstly, the treatment hospital (Royal Adelaide Hospital) was associated with a decreased probability of diarrhea, mucositis, cardiotoxicity, fatigue and neuropathy adverse effects. However, given the retrospective nature of the current study and the differences between data recorded in clinical records and data collected for research purposes, we believe this could be explained by underreporting of the adverse effects in records at that hospital, rather than a true difference. Indeed, there is evidence of reporting differences not only between health professionals [36], but also between health professionals and self-reporting by people receiving chemotherapy [37], which highlights some of the difficulties in replicating adverse effect data collection and collation across time and clinical settings. Unfortunately, the data collection method of the current study prevents further elucidation, and consequently prospective investigations are required to confirm or reject this possibility.
Of the other non-genetic predictors investigated, type of chemotherapy regimen was also commonly associated with these adverse effects. For example, increased probability of neuropathy and generalized pain was observed with 5-FU+ chemotherapy (ECF, EOF, FEC, FOLFOX) and capecitabine (alone or with oxaliplatin) chemotherapy. It is well established that nervous system toxicity is experienced more commonly following regimens that contain the platinum-based agents oxaliplatin or cisplatin (for review see Ref. [38]), such as ECF, EOF and FOLFOX. Therefore, an increased probability of neuropathy and generalized pain would be expected to be associated with these regimens.
Finally, smoking status was a common predictor associated with fatigue and skin toxicity adverse effects. With regard to fatigue, our observation that smokers had an increased probability of this adverse effect aligns with a report that long-term fatigue post-chemotherapy was related to smoking [39]. However, the association with skin toxicity remains unclear, as both non-smokers and smokers had a decreased probability of this adverse effect compared to ex-smokers.
The overall contribution of all predictors to the adverse effects was also assessed, and including all predictors improved the % AUC of the ROC curves by greater than 10% for the constipation and hand–foot syndrome adverse effects. The possible mechanism underlying the influence of the genetic predictors on constipation was discussed above. In contrast, the association of the non-genetic predictors requires further investigation. For example, the impact of chemotherapy regimen-type predictors on constipation remains unclear but could be explained by different drug classes (e.g. platinum-based, oxaliplatin [40]) in the different regimens. Similarly, the associations of the genetic predictors with hand–foot syndrome, namely the TLR4 (rs4986790) SNP with decreased probability and the IL10 (rs1800871) SNP with increased probability, remain unclear, with no previous reports in the literature. The mixed (increased and decreased) probability of association of chemotherapy regimen types with this adverse effect is most likely explained by the differing potential of chemotherapeutics to cause this adverse effect (for review see Ref. [41]).
With regards to validating our pilot study findings, which identified TLR2 and TNF genetic predictors and colorectal and gastric cancer type non-genetic predictors of overall GI toxicity [11], a combined model in the current study improved the % AUC of the ROC curves over the genetic predictors model by only 6% and identified different genetic (TGFB1) and non-genetic predictors. Although both were retrospective studies, there were a number of differences between the pilot and current studies that could explain the differences. These included the number of participants (34 vs 155), frequencies of the different cancer types (breast: 38 vs 46%; colorectal: 53 vs 48%; upper gastric: 9 vs 6%) and chemotherapy regimen types (5-FU/LV: 17 vs 15%; capecitabine: 15 vs 11%; 5-FU+: 68 vs 74%), the time when the study was conducted (2009–2012 vs 2012–2018) and recruitment from different treatment hospitals (Flinders Medical Centre vs Flinders Medical Centre and Royal Adelaide Hospital). We believe the latter two factors in particular may explain the differences in models, especially the degree to which the adverse effects were reported and the quality of that reporting in the clinical records, as discussed previously.
Bayesian network associations between predictors and adverse effects
Given the above complexity of understanding both the genetic and non-genetic predictors associated with the adverse effects from the other analysis approaches, it was important to attempt to combine all available information in the current study using the more stringent Bayesian network analysis, which also accounted for the severity/grade of the adverse effect. This novel analysis reduced the number of direct and indirect links with both genetic and non-genetic predictors and included direct links between the different adverse effects and the severity/grade of each effect.
Despite not replicating our pilot findings, the links in the network between some SNPs were expected from previous studies, for example between IL1B rs1143627 and rs16944, IL10 rs1800871 and rs1800896, TLR4 rs4986790 and rs4986791, and CASP5 (previously CASP1) rs554344 and CASP1 rs580253 [42, 43]. These links were also observed, but not specifically reported, in our pilot study cohort [11]. In contrast, the link between the MD2 (LY96) rs11466004 and IL10 rs1800871 SNPs has not been previously reported. The direct links between adverse effects and predictors, some of which were novel, were minimal and included: treatment hospital and cardiotoxicity/fatigue; IL6R rs8192284 and constipation; sex and diarrhea; cancer type and neuropathy; fluoropyrimidine-based chemotherapy regimen type and neuropathy/hand–foot syndrome; alcohol use and generalized pain; and various links between the different adverse effects. Also of note were direct links between different comorbidities, which had some basis given our understanding of the predictors of these comorbidities, for example hypertension and cardiovascular disease. Although of great interest, the novelty and descriptive nature of this network (the first Bayesian network to include genetic and non-genetic predictors) limited our ability to compare our network with others in the literature. Consequently, future studies are required to replicate the links observed and to examine the network stability with bootstrapping or other methods. In addition, as the severity/grade of the adverse effects was not directly linked to other genetic and non-genetic predictors, but only indirectly linked through the adverse effect itself, the clinical significance of these indirect links with regards to people at risk of grade 3 severe adverse effects requires further elucidation.
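One way the network stability mentioned above might be examined with bootstrapping is sketched below in Python. The sketch resamples participants with replacement and records how often each link is recovered. The `toy_learner` function is a deliberately simple stand-in that links binary variables which agree often; it is not a Bayesian network structure-learning algorithm, and all names, thresholds and records are hypothetical and are not study data.

```python
import random
from collections import Counter


def bootstrap_edge_frequency(rows, learn_edges, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each edge is recovered."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        # Resample participants with replacement, keeping the cohort size.
        sample = [rng.choice(rows) for _ in rows]
        counts.update(set(learn_edges(sample)))
    return {edge: c / n_boot for edge, c in counts.items()}


def toy_learner(rows, threshold=0.8):
    """Stand-in for a structure learner: links binary variables that agree often."""
    names = sorted(rows[0])
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            agreement = sum(r[a] == r[b] for r in rows) / len(rows)
            if agreement >= threshold:
                edges.append((a, b))
    return edges


# Hypothetical binary records for 20 participants (not study data): fatigue
# always tracks palpitations, nausea agrees with fatigue in 16 of 20 records,
# and constipation never matches fatigue.
rows = [
    {
        "palpitations": i % 2,
        "fatigue": i % 2,
        "nausea": i % 2 if i >= 4 else 1 - i % 2,
        "constipation": 1 - i % 2,
    }
    for i in range(20)
]
stability = bootstrap_edge_frequency(rows, toy_learner)
```

Links recovered in nearly every resample (here fatigue and palpitations) would be regarded as stable, whereas links recovered only intermittently (here fatigue and nausea) would warrant caution. In practice, `learn_edges` would be replaced by the structure-learning procedure actually used to build the network.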