2.1 KMT2A rearrangements
The KMT2A gene has a myriad of fusion partners. In ALL, the most important partner is AFF1, with MLLT3 (AF9) and MLLT1 (ENL) also being more frequent than the others [5].
KMT2A translocations frequently occur in infants, and some newborns already show signs of full-blown leukemia [15]. These translocations are therefore natural candidates for a prenatal origin. Among the studies that investigated the prenatal status of ALL in general and KMT2A-rearranged ALL in particular were those that first traced ALL back to an in utero event [10, 15] and the first to trace it back to Guthrie cards [17]. Uckun et al. [13] showed actual in utero presence of the fusion gene in fetal tissue from abortions. This study also found one case of a healthy infant expressing the KMT2A-AFF1 fusion transcript, which suggests that KMT2A fusions are also present in healthy individuals and will not necessarily lead to overt leukemia. However, it is unknown whether this infant developed leukemia later on. Additionally, other studies [42, 43] failed to reproduce these findings, leaving the question of whether KMT2A translocations occur more frequently than the corresponding leukemia at least partly unanswered. It is of note, though, that one should expect to find KMT2A fusions in fetal tissue even if leukemia development were inevitable. To detect them reliably, however, the cohort would have to be much larger than the 29 samples studied by Uckun et al. [13].
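The cohort-size argument above can be made concrete with a minimal sketch. It assumes that carriers occur independently at a fixed frequency; the frequency used here (1 in 10,000) is a hypothetical placeholder chosen only for illustration, not a value from the cited studies, and the function names are illustrative.

```python
import math


def prob_at_least_one_carrier(n_samples: int, carrier_freq: float) -> float:
    """Probability that a cohort of n_samples contains at least one carrier."""
    return 1.0 - (1.0 - carrier_freq) ** n_samples


def samples_needed(carrier_freq: float, power: float = 0.95) -> int:
    """Smallest cohort size that contains at least one carrier with probability `power`."""
    if not 0.0 < carrier_freq < 1.0 or not 0.0 < power < 1.0:
        raise ValueError("carrier_freq and power must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - carrier_freq))


# Hypothetical carrier frequency (assumption, not taken from the cited studies).
ASSUMED_FREQ = 1e-4

# Chance that a cohort the size of the one studied (29 samples) holds a carrier.
print(prob_at_least_one_carrier(29, ASSUMED_FREQ))

# Cohort size required for a 95% chance of including at least one carrier.
print(samples_needed(ASSUMED_FREQ))
```

Under this placeholder frequency, 29 samples would include a carrier in only about 0.3% of cohorts, and roughly 30,000 samples would be needed for a 95% chance. The actual numbers depend entirely on the true, unknown carrier frequency.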
There are two possibilities that explain how the KMT2A fusions contribute to leukemia development and the short latency periods after birth: (1) The fusion itself is sufficient for leukemia onset. This would mean that leukemia development is inevitable and no healthy individuals carrying the fusions exist. (2) A secondary mutation is required, but is triggered by the fusion protein. This would also be in line with the short latency. If the fusions trigger the additional mutation, leukemia development might be inevitable, but it would allow for the theoretical possibility of healthy carriers (Fig. 2). As only one such case has been described [13], it is not possible to infer from the presence or absence of healthy carriers which model is actually at work.
2.2 ETV6-RUNX1
There is ample evidence that the translocation t(12;21), which fuses the transcription factor genes ETV6 and RUNX1, predominantly, maybe even always, arises in utero [11, 18]. This was first shown in a twin study in 1998 [11]. Here, both twins had exactly the same breakpoint, something that had never before been described for ETV6-RUNX1+ leukemia [11]. This supports a model in which the preleukemic clone arises in one twin and spreads to the other via the shared placenta. The secondary mutations in the twins differed, hinting at a postnatal origin. Additionally, this and several other studies were able to trace this leukemia type back to Guthrie cards [11, 18, 19, 31, 33, 34].
The ETV6-RUNX1 fusion alone is not sufficient for leukemia development. For that, secondary postnatal mutations are necessary. Several studies have therefore investigated the ETV6-RUNX1 fusion in healthy individuals, especially newborns, although the frequency of the translocation remains controversial. Initially identified in the umbilical cord blood of one healthy newborn and the peripheral blood of 13 healthy children and adults [32], ETV6-RUNX1 was subsequently shown to be present in ~1% of newborns [6]. Several Danish studies later challenged these findings [36, 44–47], but newer reports confirmed the original results [7, 37–39].
There are several possible explanations for the contradictory results of these studies. The material used is one of the factors that can influence the outcome. All studies used umbilical cord blood (UCB) for the investigation of newborns. However, some studies used fresh UCB, handled within 24 h of blood draw, while others used frozen UCB or did not specify whether the material was fresh. Interestingly, all studies that identified no or very few ETV6-RUNX1+ cells in the UCB used fresh UCB [45, 46, 48] or in one case fresh embryonic liver [44]. Using freshly harvested cells has the advantage of accurately representing the neonatal hematopoietic environment. It does, however, require a great deal of time and money. It is unlikely that the different results are explained by the use of fresh or stored UCB, because (1) one study by Ornelles et al. [38] used fresh UCB and identified 2.38% of samples as ETV6-RUNX1+, and (2) storage has a negative effect on RNA, especially when RNA is released from dead cells [49], so the studies using frozen UCB should have found fewer, not more, ETV6-RUNX1+ cells. Then again, it was shown that apoptotic signals can induce double-strand breaks in both ETV6 and RUNX1 and that this can lead to the ETV6-RUNX1 fusion [32]. Storage could therefore induce the translocation, but probably at very low levels and in far fewer samples than reported by the studies using frozen UCB [6, 7, 37]. Also, if freezing induced the ETV6-RUNX1 fusion, Ornelles et al. [38], who used fresh UCB, should not have found any positive samples.
A more likely cause of the different results is the use of different detection methods. Most studies used nested reverse transcriptase PCR (nRT-PCR) or quantitative RT-PCR (qRT-PCR). The advantage of qRT-PCR is that it allows for quantification of the fusion transcript. The nRT-PCR may be more sensitive, as it uses a nested PCR setup, but it is not quantitative. Both methods are, like all RNA methods, vulnerable to contamination, the nRT-PCR approach even more so as it is an open-tube technique, although contamination in qRT-PCR can also lead to overestimation of prevalence. However, almost all studies, regardless of their results, used qRT-PCR, and some used multiple techniques for validation. Mori et al. [6] used nRT-PCR and then qRT-PCR and FISH to validate their finding that ~1% of newborns carried the fusion. Lausten-Thomsen et al. [46] initially found 14 of 1417 (~1%) samples to be ETV6-RUNX1+ by qRT-PCR. After dot-blot validation, nine positives remained. It was only the second validation, with RNA of flow-cytometry-sorted frozen UCB cells, that led the authors to conclude that the results were false positives. Hence, the specific method used may positively or negatively impact the detection of the ETV6-RUNX1 fusion, in combination with the quality and quantity of the input material. Input material of low quality or quantity might lead to false-negative results. Ultimately, all studies but one used RNA as the basis for their analysis. DNA is more stable than RNA by a factor of 10,000 when stored frozen [49] and is thus the better choice for stored material. Furthermore, RNA produces the same fusion point for every breakpoint. This is advantageous for screening purposes but makes identification of contaminants impossible. At the DNA level, in contrast, identical breakpoints have only been reported for identical twins [11], so a possible contamination is easy to detect. To date, we have conducted the only study identifying ETV6-RUNX1+ cells via DNA quantification [7]. We used the novel GIPFEL technique [50], which allows for the indirect identification of chromosomal translocations at the DNA level. In this study, we identified 5% of healthy newborns as ETV6-RUNX1+. Additionally, we sequenced the chromosomal breakpoints of five positive samples.
One could argue that differences between populations might lead to different ETV6-RUNX1 frequencies in the healthy population. Population differences have been identified for some tumor entities, including ETV6-RUNX1+ and TCF3-PBX1+ leukemias, the latter of which is more frequent in Latin America [51, 52]. ETV6-RUNX1 is much less common in East Asians [53], Hispanics [54], and especially in Maori, where only 5.4% of pediatric ALL cases harbor this translocation [55]. Interestingly, the survival rates of ETV6-RUNX1+ Maori did not differ from those of other ethnicities, probably due to equal access to ALL treatment for all in New Zealand [55]. Except for the Japanese study by Eguchi-Ishimae et al. [32] and the US study by Ornelles et al. [38], all studies used primarily Caucasian European populations. Population differences are therefore highly unlikely to explain the discrepant ETV6-RUNX1 frequencies reported by these studies.
Notably, all studies that could not identify ETV6-RUNX1+ newborns were conducted with a Danish population. However, Olsen et al. [48] found 10 out of 2005 healthy adults to express ETV6-RUNX1 at low levels. That is significantly more than would be expected if the incidence were equal to the leukemia rate (t test, P = 0.0019). This implies that adults carry the fusion at a higher prevalence than the leukemia rate. It is therefore reasonable to assume that the same is true for children, even though this does not prove a prenatal origin. Furthermore, we also screened UCB samples from Denmark and were able to identify ETV6-RUNX1 carriers [7]. Hence, it is highly unlikely that the differences between the studies are a result of using samples from the Danish population, especially as the leukemia incidence in Denmark does not differ from the incidences of other European countries [56].
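The comparison of 10 positives among 2005 adults with the leukemia rate can be sketched as a one-sided exact binomial test. Note that this is a swapped-in technique, not the t test reported above, and the leukemia rate used below is a hypothetical placeholder rather than a value from the cited study, so the output is not expected to reproduce the published P value.

```python
import math


def binom_upper_tail(k_observed: int, n: int, p: float) -> float:
    """Exact P(X >= k_observed) for X ~ Binomial(n, p).

    The lower tail is summed and subtracted from 1, which keeps the
    binomial coefficients small when k_observed is small.
    """
    lower = sum(
        math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(k_observed)
    )
    return max(0.0, 1.0 - lower)


N_ADULTS = 2005    # healthy adults screened (from the text)
N_POSITIVE = 10    # fusion-positive adults (from the text)
ASSUMED_RATE = 1 / 2000  # hypothetical leukemia rate, for illustration only

p_value = binom_upper_tail(N_POSITIVE, N_ADULTS, ASSUMED_RATE)
print(f"observed fraction: {N_POSITIVE / N_ADULTS:.4f}")
print(f"P(X >= {N_POSITIVE}) under the assumed rate: {p_value:.3g}")
```

The conclusion of such a test depends entirely on the reference rate that is plugged in, which is why the placeholder should be replaced by the appropriate incidence before any interpretation.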
The real discussion might not be whether the ETV6-RUNX1 fusion is present in healthy newborns but at what frequency. Originally, Mori et al. [6] reported frequencies of fusion-positive cells of 10⁻⁴ to 10⁻³, but those frequencies were not confirmed by later studies [36, 37, 39, 46, 48]. The frequency in investigated adults was markedly lower, but that is in line with the reduced risk for ETV6-RUNX1+ leukemia in adults [48]. However, all studies that confirmed the presence of ETV6-RUNX1 in healthy newborns and looked at the frequency also found it to be much lower [37, 39]. Lausten-Thomsen et al. [46] initially found ~1% of samples to be ETV6-RUNX1-positive, with a frequency of ≤ 10⁻⁵, which makes this study very important in challenging the proposed frequency of the preleukemic cells. We also tried to address this in our study [7], but the frequency we found can only be compared with the others with reservations. We used CD19⁺-sorted cells and had a bias, because not all PCR products are amplified in the same way. Therefore, these numbers should be considered an estimate. Furthermore, we used DNA instead of RNA, which must be taken into account when comparing the studies. In our study, we also confirmed the presence of ETV6-RUNX1 by qRT-PCR in two cases [7]. The frequency was ~10⁻⁴, which would be more in line with the studies that found low frequencies.
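The frequency of fusion-positive cells also determines how often a sample can be positive at all. As a rough sketch, assuming that positive cells are sampled according to a Poisson distribution and that the assay detects every positive cell that is present (an idealization), the probability that the input contains at least one positive cell can be estimated as follows. The input cell counts are hypothetical and serve only as an illustration.

```python
import math


def prob_sample_contains_positive_cell(cell_frequency: float, cells_analyzed: int) -> float:
    """Poisson approximation of P(at least one fusion-positive cell in the input)."""
    expected_positive = cell_frequency * cells_analyzed
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-expected_positive)


# Cell frequencies discussed in the text; input cell counts are assumptions.
for freq in (1e-3, 1e-4, 1e-5):
    for cells in (10_000, 100_000, 1_000_000):
        p = prob_sample_contains_positive_cell(freq, cells)
        print(f"frequency {freq:.0e}, {cells:>9,} cells: P(>=1 positive cell) = {p:.3f}")
```

Under these assumptions, a cell frequency of 10⁻⁵ combined with an input of 10,000 cells would yield a positive cell in only about 10% of carrier samples, which illustrates how low input quantity can produce false-negative results.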
2.3 TCF3-PBX1
The TCF3-PBX1 fusion is the product of a balanced or unbalanced t(1;19) translocation and is among the most frequent aberrations in childhood ALL. It is especially common in Latin America [51, 52] and among black children [57], in whom as many as 11.8% of childhood ALL cases carry this fusion.
Unlike the aforementioned translocations, TCF3-PBX1 has long been considered to arise only postnatally. Still, the fusion could be traced back to Guthrie cards by Wiemels et al. in two cases [40]. In both cases, only one segment of the blood spot was positive for the fusion, and the fusion points showed signs of site specificity and of terminal deoxynucleotidyl transferase (TdT) activity, and so TCF3-PBX1 was declared postnatal. The site specificity hints at aberrant V(D)J recombination. During fetal hematopoiesis, no or only a few nontemplated nucleotides are inserted, whereas such insertions are common in children and adults [58–60], so signs of TdT activity at the fusion point argue for a postnatal event. However, in another backtracking study, one TCF3-PBX1 patient could be traced back to the respective Guthrie card by screening for IgH rearrangements [31].
Following our success with GIPFEL and ETV6-RUNX1 [7], we also looked for TCF3-PBX1 in healthy newborns. In 2 of 340 (0.6%) cases, we were able to identify the fusion and also the exact fusion point [9]. The presence of the TCF3-PBX1 fusion in the UCB of newborns is definite proof of prenatal origin. It is, however, not clear whether TCF3-PBX1+ newborns will remain healthy throughout their lifetime. In the ETV6-RUNX1 study [7], 50/1000 (5%) were translocation positive, which gave the study enough statistical power to conclude that most of them will never develop leukemia. For TCF3-PBX1, it is unlikely but not impossible that both newborns will develop ALL. Identifying the TCF3-PBX1 fusion in healthy newborns could therefore prove that TCF3-PBX1 can arise prenatally but not that its frequency definitely exceeds the ALL incidence. The data from these studies paint a picture in which TCF3-PBX1 can arise prenatally but possibly also throughout an individual’s lifetime. Studies investigating the frequency of TCF3-PBX1+ ALL in children and adults found a slight decrease from 5% of ALL cases in children to 3% in adults [5]. Thus, TCF3-PBX1 either (1) always arises prenatally and can have a very long latency phase or (2) can also arise postnatally, which would explain the mild decrease from childhood to adulthood.
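The difference in statistical power between 2 of 340 and 50 of 1000 positives can be illustrated with confidence intervals for the two proportions. The following minimal sketch uses the Wilson score interval; this is our choice for illustration and not necessarily the method used in the cited studies.

```python
import math


def wilson_interval(positives: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need total > 0 and 0 <= positives <= total")
    p_hat = positives / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p_hat + z2 / (2.0 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / total + z2 / (4.0 * total * total)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Counts taken from the text: TCF3-PBX1 [9] and ETV6-RUNX1 [7].
for label, positives, total in (("TCF3-PBX1", 2, 340), ("ETV6-RUNX1", 50, 1000)):
    low, high = wilson_interval(positives, total)
    print(f"{label}: {positives}/{total} -> {low:.2%} to {high:.2%}")
```

With these counts, the interval for 2 of 340 spans roughly 0.2% to 2.1%, more than an order of magnitude, whereas the interval for 50 of 1000 is comparatively narrow at roughly 3.8% to 6.5%. This is consistent with the caution that the TCF3-PBX1 data alone cannot show that the carrier frequency exceeds the ALL incidence.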
2.4 BCR-ABL1
The BCR-ABL1 fusion is the product of a t(9;22) translocation, widely known as the Philadelphia chromosome, which was the first chromosomal aberration ever to be consistently associated with a human cancer [61]. The fusion of these genes can create three different proteins: p190, p210, and p230. These differ in their BCR breakpoints, with m-BCR (minor) leading to p190, M-BCR (major) to p210, and μ-BCR (micro) to p230 [62, 63]. Classically, BCR-ABL1 is present in adult chronic myelogenous leukemia (CML), where 90–95% of patients carry the Philadelphia chromosome. Of these patients, over 99% express the p210 isoform. However, the fusion is also present in ALL. In adult ALL, 25% of cases have a t(9;22) [5], with the majority also expressing p210, although the p190 isoform is also prominently present in this entity. In pediatric ALL, BCR-ABL1 plays a minor role, with only 3% of cases being positive for this fusion [2]. It is of interest, though, that p190 is the predominant isoform in pediatric ALL, with 90% of BCR-ABL1+ cases expressing this protein.
The p190 isoform was shown to arise prenatally in at least two pairs of monozygotic twins [8]. In each pair, both twins had the identical breakpoint, indicative of prenatal origin. Also, in one twin pair, the fusion could be traced back to the respective Guthrie cards. Interestingly, in one pair of twins, only one twin developed leukemia [8]. It is, of course, possible that the second twin developed ALL later in life, but this discordance shows that secondary hits are necessary and that these hits are acquired postnatally. This also hints at the possibility that BCR-ABL1 may arise prenatally in children who will never develop ALL. Additionally, the specificity of the isoforms for the resulting leukemia subtype indicates that the p210 isoform probably arises postnatally, especially when one considers that it is typical for CML, which usually arises later in life.