Introduction

Most children above 2 years of age with acute lymphoblastic leukemia (ALL) who are candidates for allogeneic hematopoietic stem cell transplantation (allo-HSCT) receive myeloablative conditioning (MAC) with a fractionated total body irradiation (FTBI)-containing regimen [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17]. Whether chemotherapy can effectively replace FTBI is an important matter of debate. Because of the known late effects of FTBI, which include endocrine complications (growth impairment, hypothyroidism, and delayed onset of puberty), infertility, cognitive impairment, cataracts, and an increased risk of secondary malignancies, avoiding FTBI in the preparation for allo-HSCT is desirable [18,19,20,21,22,23]. To date, it has not been shown that FTBI can be successfully replaced by chemotherapy during conditioning for pediatric ALL [2, 3, 5, 24,25,26].

To compare outcomes of FTBI versus chemotherapy-based conditioning (CC) in childhood ALL, we performed this international retrospective registry-based study. The study was initiated and conducted on behalf of the Paediatric Diseases Working Party of the European Society for Blood and Marrow Transplantation (EBMT). The primary endpoint was leukemia-free survival (LFS). Overall survival (OS), relapse incidence (RI), nonrelapse mortality (NRM), and incidence of acute graft versus host disease (aGvHD) and chronic GvHD (cGvHD) were the secondary endpoints.

Patients and methods

Children and adolescents aged between 2 and 18 years undergoing a first allo-HSCT for ALL in first (CR1) or second complete remission (CR2) after MAC with either bone marrow (BM) or peripheral blood stem cells (PBSC) from either a matched-related (MRD) or unrelated donor (UD) between 2000 and 2012 were included in the study. This observation period was chosen to obtain a reasonable duration of follow-up (FU). Moreover, the prospective international randomized ALL SCTped 2012 FORUM trial was started in 2013 and is still recruiting patients. Data were obtained from the EBMT database ProMISe (Project Manager Internet Server) and analyzed in the EBMT study office in Paris, France. The study was performed in accordance with the Declaration of Helsinki. The local institutional review board at each participating site approved the allo-HSCT procedures. Patients and/or their legal guardians gave written informed consent to the use of clinical data and to research participation. All authors had access to the primary clinical data.

Statistical analysis

The study population was divided into two groups (patients in CR1 and CR2). Patients’ demographic and clinical characteristics were summarized using the median and interquartile range for continuous variables and counts and percentages for categorical variables. Preparative regimens were FTBI versus CC. For both remission groups the two conditioning regimens were compared using Fisher’s exact test or χ² test for categorical variables and Wilcoxon rank sum test for continuous variables [27].

Median FU was calculated using the reverse Kaplan–Meier method. The primary endpoint was LFS, defined as the probability of being alive and free of disease at any point in time. Thus, death and disease relapse were treated as events. Patients alive and free of disease at their last FU were censored [28, 29]. OS was defined as the probability of survival irrespective of the disease state at any point in time. Patients alive at their last FU were censored. RI was defined as the probability of having experienced a relapse, with death without relapse as the competing event. NRM was defined as the probability of dying without previous occurrence of a relapse, with relapse as the competing event. Incidences of aGvHD (grade III–IV), cGvHD, and extensive cGvHD were defined as the first event of aGvHD (grade III–IV), cGvHD, and extensive cGvHD, respectively. Death and relapse were considered as competing events. OS, RI, NRM, and incidence of aGvHD and cGvHD were secondary endpoints [30].
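As an illustration of the competing-risk definitions above, the following minimal Python sketch computes a nonparametric (Aalen–Johansen type) cumulative incidence. It is not the code used for the study, whose analyses were run in R. The function name, the event coding (0 = censored, 1 = relapse, 2 = death without relapse), and the toy data are assumptions made for illustration only, and the sketch is unweighted.

```python
def cumulative_incidence(times, events, cause, horizon):
    """Cumulative incidence of `cause` up to `horizon` with competing risks.

    Hypothetical event coding: 0 = censored, 1 = relapse,
    2 = death without relapse.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        d_cause = d_any = n_here = 0
        # Gather everything that happens at this distinct time point.
        while i < len(data) and data[i][0] == t:
            e = data[i][1]
            n_here += 1
            if e != 0:
                d_any += 1
            if e == cause:
                d_cause += 1
            i += 1
        # Increment: event-free up to t, then an event of type `cause` at t.
        cif += event_free * d_cause / n_at_risk
        event_free *= 1.0 - d_any / n_at_risk
        n_at_risk -= n_here
    return cif


# Toy data (invented): five patients, follow-up in years.
times = [1, 2, 3, 4, 5]
events = [1, 2, 0, 1, 0]
ri = cumulative_incidence(times, events, cause=1, horizon=5)   # relapse
nrm = cumulative_incidence(times, events, cause=2, horizon=5)  # nonrelapse death
lfs = 1.0 - ri - nrm
```

With this estimator, RI, NRM, and LFS at a given time point sum to one, which mirrors how the endpoints defined above partition the cohort.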

The inverse probability weighting (IPW) method using the propensity score was used to calculate weights and adjust for confounding factors between the treatment groups [31]. The confounding factors considered were age at allo-HSCT, year of allo-HSCT, time from diagnosis to allo-HSCT, cytomegalovirus (CMV) serology, stem cell source, and sex mismatch (female to male versus other).
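The weighting step can be sketched as follows, assuming that a propensity score (the estimated probability of receiving FTBI given the confounders) is already available for each patient. The text does not state which estimand was targeted; this sketch assumes weights for the average treatment effect, and the function names and toy values are hypothetical. It does not reproduce the propensity model itself.

```python
def ipw_weights(treated, propensity):
    """Inverse probability weights (assumed estimand: average treatment effect).

    treated:    1 if the patient received FTBI, 0 if CC (hypothetical coding)
    propensity: estimated probability of receiving FTBI given the confounders
    """
    weights = []
    for z, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Each patient is weighted by the inverse of the probability of the
        # conditioning actually received.
        weights.append(1.0 / p if z else 1.0 / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted mean, e.g. to check covariate balance after weighting."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data (invented): three FTBI patients and one CC patient.
treated = [1, 1, 1, 0]
propensity = [0.8, 0.8, 0.5, 0.5]
weights = ipw_weights(treated, propensity)
```

Patients who received a conditioning that was unlikely given their characteristics get larger weights, so that the weighted groups resemble each other with respect to the measured confounders.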

The weighted Kaplan–Meier method was used to estimate the standardized probability of survival for LFS and OS, and the weighted cumulative incidence function was used to calculate the cumulative incidence of relapse (RI), NRM, aGvHD, and cGvHD [27,28,29,30]. P values to evaluate survival differences between the two conditioning regimens were calculated using a weighted proportional hazards Cox model including center as a random effect [32]. Results were expressed as weighted probabilities, weighted cumulative incidences, and hazard ratios with their 95% confidence intervals (95% CI). All tests were two-sided. The type 1 error rate was fixed at 0.05 for determination of factors associated with time to event. Analyses were performed using the R statistical software, Version 3.4.3 (R Development Core Team, Vienna, Austria). Weights were calculated using the twang R package [33]. The date of analysis was October 1, 2018.
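A weighted Kaplan–Meier estimate can be sketched in a few lines of Python. This is a simplified illustration under assumed names and invented data, not the R code used for the analysis: each patient contributes a weight instead of a count of one to both the number of events and the number at risk.

```python
def weighted_kaplan_meier(times, events, weights, horizon):
    """Weighted Kaplan-Meier survival probability at `horizon`.

    events:  1 = event (e.g. relapse or death for LFS), 0 = censored
    weights: one positive weight per patient (e.g. from IPW)
    """
    data = sorted(zip(times, events, weights))
    at_risk = sum(weights)  # weighted number at risk
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        d = 0.0        # weighted number of events at time t
        leaving = 0.0  # weighted number leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            _, e, w = data[i]
            leaving += w
            if e:
                d += w
            i += 1
        surv *= 1.0 - d / at_risk
        at_risk -= leaving
    return surv


# Toy data (invented): four patients, follow-up in years.
times = [1, 2, 3, 4]
events = [1, 0, 1, 0]
unweighted = weighted_kaplan_meier(times, events, [1, 1, 1, 1], horizon=4)
weighted = weighted_kaplan_meier(times, events, [2, 1, 1, 1], horizon=4)
```

With all weights equal to one the function reduces to the ordinary Kaplan–Meier estimator. Giving the first patient, who has an early event, twice the weight lowers the estimated survival.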

Results

Characteristics of study patients

A total of 3,054 pediatric patients from European and non-European EBMT centers in 45 countries were included. Between 2000 and 2012, 2,630 patients received an FTBI-based and 424 patients received a chemotherapy-based MAC before allo-HSCT. A total of 1,498 patients (49%) were transplanted in CR1 and 1,556 (51%) in CR2. In the CR1 cohort, median FU was 6.8 years (FTBI group) and 6.1 years (CC group), while in the CR2 cohort, median FU was 6.2 years in both the FTBI and the CC group. In both remission groups, the two conditioning groups differed significantly with regard to age at allo-HSCT, year of allo-HSCT, time from diagnosis to allo-HSCT, stem cell source, and CMV serology (donor/patient, Table 1). These confounding factors and the different sizes of the two conditioning groups required adjustment by the IPW method (propensity score, see “Statistical analysis”).

Table 1 Patient characteristics.

Hematopoietic stem cell donors and source

A total of 1,626 patients (53%) were grafted from a UD and 1,428 patients (47%) from an MRD. The majority (n = 2,105, 69%) received BM and 949 patients (31%) received PBSC (Table 1).

Preparative regimens

The most commonly applied conditioning regimens were FTBI-based (n = 2,630). In CR1 and CR2, 1,285 (86%) and 1,345 (86%) patients, respectively, received FTBI-based conditioning. A total of 424 patients received CC (CR1: n = 213 (14%), CR2: n = 211 (14%), Table 1).

FTBI-based

FTBI/Cy (n = 990, 38%) and FTBI/Eto (n = 784, 30%) were the two most frequently used combinations. The remaining patients received various other FTBI-based combinations (n = 856, 32%, Table 1).

Chemotherapy-based

In the CR1 cohort, 213 patients (14%) received CC. These regimens consisted of Busulfan/Cyclophosphamide (Bu/Cy, n = 68), Bu/Cy/Etoposide (Bu/Cy/Eto, n = 66), Bu/Cytarabine (AraC) +/− Melphalan (Mel, n = 23), Bu/Cy/Mel (n = 20), Bu/Fludarabine (Flu, n = 20), Bu/Cy/Thiotepa (Thio, n = 14), and Bu/Flu/Thio (n = 2).

In the CR2 cohort, 211 patients (14%) received CC. These regimens consisted of Bu/Cy (n = 68), Bu/Cy/Eto (n = 52), Bu/AraC +/− Mel (n = 35), Bu/Cy/Thio (n = 18), Bu/Cy/Mel (n = 17), Bu/Flu (n = 13), and Bu/Flu/Thio (n = 8, Table 1).

Outcomes

Patients transplanted in CR1

Five-year OS was 68.8% (95% CI 66.3–71.5) after FTBI and 74.1% (95% CI 71.1–77.3) after CC (P = 0.25). Five-year LFS was 63.8% (95% CI 61.2–66.5) after FTBI and 61.4% (95% CI 58.0–64.9) after CC (P = 0.83). Five-year RI was 22.4% (95% CI 20.1–25.0) after FTBI and 26.9% (95% CI 19.7–36.9) after CC (P = 0.33). Five-year NRM was 13.8% (95% CI 11.9–15.9) after FTBI and 11.7% (95% CI 6.9–19.8) after CC (P = 0.47). Incidence of aGvHD grade III–IV at day 100 was 11.8% (95% CI 10.1–13.7) after FTBI and 16.9% (95% CI 10.7–26.7) after CC (P = 0.16). Five-year incidence of cGvHD was 24.3% (95% CI 21.8–27.1) after FTBI and 20.8% (95% CI 13.7–31.4) after CC (P = 0.60). Five-year incidence of extensive cGvHD was 11.3% (95% CI 9.5–13.4) after FTBI and 8.2% (95% CI 4.3–15.9) after CC (P = 0.54, Table 2, Fig. 1a).

Table 2 Weighted analysis of survival by conditioning regimen of patients in CR1 and CR2.
Fig. 1: Survival by conditioning regimen.

a Outcomes of patients in CR1. b Outcomes of patients in CR2. CC chemotherapy-based conditioning, CR1 first complete remission, CR2 second complete remission, FTBI fractionated total body irradiation, LFS leukemia-free survival, NRM nonrelapse mortality, OS overall survival, RI relapse incidence.

Patients transplanted in CR2

FTBI was superior to CC in terms of OS, LFS, RI, and NRM. In detail, five-year OS was 58.5% (95% CI 56.2–61.6) after FTBI and 35.9% (95% CI 33.0–39.1) after CC (P < 0.0001). Five-year LFS was 53.7% (95% CI 51.1–56.5) after FTBI and 29.4% (95% CI 26.6–32.5) after CC (P < 0.0001). Five-year RI was 30.6% (95% CI 28.1–33.3) after FTBI and 49.3% (95% CI 40.3–60.2) after CC (P < 0.0001). Five-year NRM was 15.7% (95% CI 13.8–17.9) after FTBI and 21.3% (95% CI 15.1–30.2) after CC (P = 0.044).

Significant differences in the incidence of aGvHD grade III–IV at day 100, cGvHD and extensive cGvHD were not detected (Table 2, Fig. 1b).

Discussion

Most pediatric patients with ALL aged above 2 years who undergo allo-HSCT receive FTBI as part of the preparative regimen [1,2,3,4,5, 8,9,10, 12, 13, 15,16,17]. Adverse late effects such as endocrine disorders, infertility, cognitive impairment, cataracts, and an increased risk of secondary malignancies are a major burden of this treatment modality but can, at least to a certain extent, also occur after CC (e.g., Bu/Cy/Eto) [18,19,20,21,22]. However, to date, it has not been proven that FTBI can be omitted from the preparation for allo-HSCT and replaced by CC without jeopardizing LFS [3, 5, 24, 25, 34]. Nevertheless, myeloablative CC remains widely applied in Europe and elsewhere. To compare outcomes of FTBI with CC in pediatric ALL, we performed this multinational retrospective study.

Our study cohort was intentionally restricted to patients who received a first allo-HSCT in CR1 or CR2 after MAC, with BM or PBSC as stem cell source and an MRD or UD as donor, in order to obtain a more uniform cohort. In this study, all CC regimens were Bu-based. Bu/Cy, a well-established preparative regimen for pediatric [35,36,37] and adult patients [38, 39], was most frequently applied. Bu/Cy/Eto was the second most frequently used MAC. This combination was applied in the international Berlin–Frankfurt–Münster (iBFM/BFM) clinical trials [1, 40, 41], particularly in infants (Interfant-99) [42, 43], and elsewhere [2, 44,45,46,47]. Within the observation period of this study, the alternative alkylator treosulfan was increasingly used for children with malignancies, but no treosulfan-containing regimen reached a significant number of cases [48].

The FTBI and the CC group differed with regard to number of cases, as well as some clinical features, as mentioned above (see Statistical analysis). These potential confounders were adjusted for by the IPW method (propensity score) in order to allow the comparison of the outcomes of the two conditioning groups.

In the CR1 cohort, the outcome after FTBI was not significantly different from that after CC. This is a new, interesting finding, although we do not know the reasons for omission of FTBI, which could be manifold: (1) young age, (2) negativity of minimal residual disease before allo-HSCT, (3) high risk for toxicity and infection after having experienced complications during front line therapy, (4) logistical reasons such as no access to timely FTBI, and (5) decision of patients/parents. However, due to the large number of participating centers from various countries, there might be some equipoise.

Not surprisingly, overall outcomes of the CR2 cohort were inferior to those of CR1 patients. This was predominantly attributed to a significantly higher RI in the CC group of the CR2 cohort. It was impossible to evaluate risk factors for this difference. One could speculate that patients with increased risk for toxicity due to pretransplant complications and/or a history of cranial/spinal irradiation were stratified to an irradiation-free conditioning.

Interestingly, outcomes after FTBI were superior to those after CC with regard to OS, LFS, RI, and NRM in our CR2 cohort. More importantly, the superior OS, LFS, and RI of the FTBI cohort were accompanied not by a higher but by a lower NRM compared with the CC cohort.

Various hypotheses concerning the different impact of FTBI on conditioning of ALL patients in CR1 versus CR2 can be made. One could speculate that:

(1) In a retrospective study there is no possibility to identify the background of the decision for the given conditioning. Patients in CR1 with unfavorable prognostic factors might have been conditioned with FTBI and those children with a more favorable risk profile might have been treated with CC. This hypothesis might similarly apply to patients in CR2.

(2) Patients in CR2 more often had extramedullary leukemia and would benefit from FTBI.

(3) Relapsed ALL was more resistant to chemotherapeutic agents and benefited from FTBI as a new treatment element.

The potential superiority of FTBI-based conditioning in pediatric ALL has also been reported in the literature. In 2000, Davies et al. reported a 3-year LFS of 50% after FTBI versus 35% after Bu-based conditioning (P = 0.005) in a cohort of 627 pediatric patients mainly transplanted in CR1 and CR2 [26]. Three years later, Bunin et al. found a 3-year EFS of 58% after FTBI versus 29% after Bu-based conditioning (P = 0.03) in a randomized cohort of 43 children, transplanted in CR1-3 [2].

The main merit of our study is that it includes a large cohort of pediatric ALL patients who, while in remission, received either FTBI or myeloablative CC for first allo-HSCT using BM or PBSC from an MRD or UD according to several European protocols [39, 40]. Consequently, this retrospective study represents “real-world practice.” On the other hand, this registry-based study has some limitations, so our results must be considered preliminary. In fact, no data were available on: (1) The administration mode of Bu (intravenous or oral) or use of therapeutic drug monitoring and dose adjustment. (2) Cytogenetics or molecular genetics of ALL. (3) Toxicity or reasons for NRM. (4) Secondary malignancies. (5) Minimal residual disease levels at time of allo-HSCT. (6) Site of relapse after front line ALL therapy or after allo-HSCT. (7) CNS involvement. (8) Date of relapse for patients in CR2. The latter information is necessary for classifying a relapse event as very early, early, or late for further patient stratification in classes of risk [49], and for a more detailed analysis of the survival of patients transplanted in CR2. Furthermore, since our retrospective study cohort included B- as well as T-ALL phenotypes, BM and PBSC as stem cell sources, MRD and UD, and only children above 2 years of age, and spanned an observation time of 13 years, our study population still has a heterogeneous character. Moreover, our CC cohort is relatively small compared with the FTBI group.

We conclude that FTBI-based conditioning was superior to CC in terms of OS, LFS, RI, and NRM for children undergoing allo-HSCT in CR2, according to the largest study comparing outcomes of FTBI versus CC for first allo-HSCT in pediatric ALL. However, we must stress the preliminary character of the results of this retrospective “real-world-practice” study.

Prospective data comparing FTBI and CC for allo-HSCT in children and adolescents with ALL are urgently needed. Given the limitations of retrospective studies, it seems justified to ask in a prospective, preferably randomized clinical trial whether CC is at least as effective as FTBI-based conditioning in terms of outcome, toxicity, and late effects. The answer to this relevant question will hopefully be obtained by the prospective international, multicenter ALL SCTped 2012 FORUM (“For Omitting Radiation Under Majority Age”) randomized trial, which was initiated in 2012 (EudraCT number: 2012-003032-22).