Background
The ability to mount specific immune responses depends on a highly diverse repertoire of T- and B-cell antigen-receptor molecules. The genetic diversity required for millions of distinct antigen receptors is created by the somatic recombination and fusion of individual variable (V), diversity (D), and joining (J) gene segments in a process known as V(D)J recombination. During V(D)J recombination, genomic DNA is cleaved at the boundaries of individual V, D, and J gene segments and the intervening DNA is removed or inverted; subsequently, the newly apposed gene segments are ligated to form the variable-region exon of one of the four types of antigen-receptor genes (reviewed in [1]). These recombination events are mediated by RAG-1 and RAG-2 in the form of a V(D)J recombinase holoenzyme that is directed to proper sites of cleavage by DNA motifs known as recombination signals (RS). RS are located at the boundaries of V, D, and J gene segments and are defined by highly conserved heptamer and less well conserved nonamer sequences that are separated by non-conserved spacer regions 12 or 23 base pairs (bp) in length [2-5]. Under physiologic conditions, V(D)J recombination follows the "12/23 rule" to assemble functional antigen-receptor genes, i.e., cleavage and recombination occur only between RS with dissimilar spacer types.
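The 12/23 rule can be stated compactly in code. The following is a minimal sketch, not part of the original analysis: the `RecombinationSignal` record and the function name `obeys_12_23_rule` are hypothetical and introduced only for illustration. The spacer lengths assigned to the example signals are those given in the text (Vκ 12-RS, Jκ 23-RS, VH 23-RS, JH 23-RS).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecombinationSignal:
    """Hypothetical minimal record for an RS; only the spacer length is modeled."""
    name: str
    spacer_bp: int  # 12 or 23


def obeys_12_23_rule(a: RecombinationSignal, b: RecombinationSignal) -> bool:
    """Return True only when one RS has a 12-bp spacer and the other a 23-bp spacer."""
    return {a.spacer_bp, b.spacer_bp} == {12, 23}


# Example signals, with spacer types as described in the text.
v_kappa = RecombinationSignal("Vκ 12-RS", 12)
j_kappa = RecombinationSignal("Jκ 23-RS", 23)
v_h = RecombinationSignal("VH 23-RS", 23)
j_h = RecombinationSignal("JH 23-RS", 23)

print(obeys_12_23_rule(v_kappa, j_kappa))  # direct Vκ-to-Jκ joining
print(obeys_12_23_rule(v_h, j_h))          # direct VH-to-JH joining
```

Under this rule, direct Vκ-to-Jκ joining is permitted, whereas direct VH-to-JH joining is not, because both signals carry 23-bp spacers.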
RS-like sequences that are unassociated with V, D, or J gene segments have been identified in the genomes of mice and humans [4, 6-18]. A subset of these cryptic RS (cRS) are located within the Igh and Igκ loci [7-17]. cRS in the Igh locus are embedded at the 3' end of VH gene segments, where they mediate VH → VHDJH replacement reactions [7-10, 13, 15, 19]. cRS in the Igκ locus are located within introns, where they mediate inactivation of Igκ alleles [11, 17, 20-22]. With the exception of [8], previous studies of V-embedded cRS have focused on the Igh locus. VH gene replacement mediated by V-embedded cRS can rescue the development of B cells bearing autoreactive receptors and has been described as a mechanism for the maintenance of self-tolerance [7, 9, 22-24]. In fact, it has been argued that VH cRS are conserved specifically to provide a mechanism for secondary rearrangements at the Igh locus, as "secondary VH to JH [recombination] cannot work because VH and JH [RS] do not meet the [12/23] requirement for recombination and because D segments, the guardians of this rule, are deleted by the primary V(D)J recombination" [7].
Previously, we conducted a global analysis of cRS across mouse VH gene segments using a computational algorithm to predict the location and functional activity of VH cRS; these predictions were then tested using ligation-mediated PCR (LM-PCR) to detect VH cRS cleavage in purified populations of B-lineage cells recovered from mouse bone marrow [4, 25]. We discovered not only that cRS are conserved at sites distributed throughout VH gene segments but also that VH cRS are cleaved only during the pro-B cell stage of development [25]. Both results are inconsistent with the paradigmatic view that functional VH cRS are maintained to facilitate the rescue of autoreactive B cells that would otherwise be lost to the mechanisms of self-tolerance [7, 9, 22-24]. Our results suggested to us that VH cRS may be conserved for other reasons [25].
In contrast to receptor editing via VH replacement, receptor editing at the Igκ locus takes the form of either secondary, de novo Vκ → Jκ rearrangements that replace or invert primary VκJκ joins [26-30] or, more rarely, inactivating rearrangements with cRS that flank the Cκ exon [21]. Secondary, de novo rearrangements are not only possible at the Igκ locus but highly efficient because of the locus' organization: Vκ gene segments are associated with 12-RS while Jκ gene segments are associated with 23-RS, removing the need for a D gene segment and allowing repeated, direct VκJκ rearrangements; and Vκ genes are present in both orientations, resulting in many inversion rearrangements and conserving Vκ gene segments that lie between the rearranging Vκ and Jκ gene segments for subsequent rearrangements. The possibility of rearrangement at the Igλ locus further increases the opportunity for editing.
A corollary of the argument that VH cRS are conserved to provide a mechanism for secondary rearrangement at the Igh locus [7, 9] is that cRS would not be conserved within Vκ gene segments. Thus far, however, there have been no systematic attempts to search for cRS within Vκ gene segments, to determine the extent of Vκ cRS conservation, or to determine whether they are functional. Previous work searched Vκ sequence alignments for partial heptamer motifs (CACA) at a location within Vκ orthologous to the location of the 3' VH cRS [8, 9]. It was noted that 10% of the Vκ gene segments examined contain this partial heptamer motif [8]. We extend this work by using a computational algorithm that allows systematic scanning of the full length of Vκ gene segments for complete cRS [4, 6] and by showing that conserved Vκ cRS are cleaved.
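The difference between searching one position for a partial heptamer and scanning the full length of a gene segment for complete cRS can be illustrated with a deliberately simplified sketch. This is not the algorithm of [4, 6], which is not reproduced here; as a stand-in, the code below slides a heptamer-spacer-nonamer window along both strands and counts mismatches to the consensus heptamer (CACAGTG) and nonamer (ACAAAAACC). The function names, the returned record layout, and the default mismatch threshold are all assumptions made for illustration.

```python
HEPTAMER = "CACAGTG"     # consensus RS heptamer
NONAMER = "ACAAAAACC"    # consensus RS nonamer
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_strand(seq, spacer, max_mismatches):
    """Yield (start, mismatches) for heptamer-spacer-nonamer windows on one strand."""
    window = len(HEPTAMER) + spacer + len(NONAMER)
    for start in range(len(seq) - window + 1):
        hept = seq[start:start + len(HEPTAMER)]
        nona = seq[start + len(HEPTAMER) + spacer:start + window]
        mm = mismatches(hept, HEPTAMER) + mismatches(nona, NONAMER)
        if mm <= max_mismatches:
            yield start, mm


def find_candidate_crs(seq, max_mismatches=3):
    """Scan both strands for candidate 12- and 23-spacer signals.

    The threshold is an illustrative assumption, not a published cutoff.
    For "-" strand hits, 'start' is an index into the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for spacer in (12, 23):
        for strand, s in (("+", seq), ("-", reverse_complement(seq))):
            for start, mm in scan_strand(s, spacer, max_mismatches):
                hits.append({"strand": strand, "spacer": spacer,
                             "start": start, "mismatches": mm})
    return hits


# Synthetic test sequence (not a real Vκ gene segment): a perfect 23-spacer signal.
example = "GGG" + HEPTAMER + "T" * 23 + NONAMER + "GGG"
example_hits = find_candidate_crs(example)
```

A consensus-mismatch count treats every position as equally important, which real RS are not; it is used here only to show the shape of a full-length, both-strand, both-spacer scan.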
To test the hypothesis that functional cRS are not conserved in Vκ gene segments, we conducted a global examination of mouse Vκ segments using the computational and experimental methods of our earlier study of VH cRS [25]. As in our study of VH cRS, we find that Vκ cRS are present and cleaved at multiple, conserved locations in Vκ gene segments. These cRS are conserved across Vκ gene families and are cleaved during the small pre-B cell stage of B-cell development. This study is the first to show that cRS are conserved within Vκ gene segments and that these cRS are cleaved in vivo. Our findings support the hypothesis [25] that cRS are conserved in Ig V gene segments for a purpose(s) unassociated with the maintenance of self-tolerance.
Conclusion
The adaptive immune system has evolved to generate a diverse antigen-receptor repertoire. One mechanism of somatic diversification is V(D)J recombination, a process that joins antigen-receptor V, D, and J gene segments by initiating double-strand breaks at RS flanking the gene segments (for a review, see [1]). RS at locations other than the boundaries of V, D, and J segments have been identified at both the Igh and Igκ loci [22, 23]. Until recently, cRS in the Igh locus were thought to be limited to the 3' end of VH gene segments, where cRS can mediate VH gene replacement [7, 9, 22-24]. VH gene replacement can participate in a form of receptor editing at the heavy-chain locus, which otherwise is incapable of secondary rearrangements that follow the 12/23 rule [7]. It has been proposed that the utility of receptor editing is sufficient to drive the evolutionary conservation of VH cRS [7]. There is mounting evidence, however, that at least some receptor editing is antigen-independent and that the conservation of Ig VH cRS may result from other selective pressures.
The earliest evidence that the regulation of VH replacement is independent of BCR specificity came from studies [35-37] that demonstrated frequent VH replacement in mice transgenic for non-autoreactive heavy chains. These data suggested that selection for VH cRS includes the capacity for increasing BCR diversification, in addition to self-tolerance [8, 35]. We subsequently showed that VH cRS SE were detected only in pro-B cells, including the pro-B cells of μMT mice, which cannot assemble a functional BCR [25, 38]. Together, these results support the notion that VH gene replacement may not be driven by the recognition of antigen.
Koralov et al. [39] demonstrated that, in transgenic mice homozygous for nonproductive heavy-chain rearrangements, VH replacement events are only three times more frequent than direct VH-to-JH joining, which violates the 12/23 rule. These results demonstrate the inefficiency of cRS-mediated VH replacement and raise the question: How can such an inefficient mechanism for rescuing autoreactive B cells increase fitness sufficiently to maintain VH cRS conservation? If VH cRS are conserved to mediate VH replacement, shouldn't VH replacement at cRS be much more efficient than rearrangements that violate the 12/23 rule? The results of Koralov et al. [39] suggest that while VH replacement may be mediated by VH cRS, their conservation is unlikely to result only from their role in VH replacement.
Unlike the cRS associated with Igh, the cRS previously identified in Igκ loci are not embedded in Vκ gene segments but are sited in the Jκ-Cκ intron and 3' of Cκ, where they mediate locus inactivation [11, 17, 20-22]. The cRS located in the Jκ-Cκ intron are known as IRS (IRS1 and IRS2), while the cRS found 3' of Cκ is named the kappa deleting element (kde) in humans and RS in mice. For clarity, we reserve 'RS' for signals adjacent to V, D, and J gene segments and refer to the signal 3' of Cκ in mice as RSκ3.
The structure of the Igκ locus allows for secondary Vκ → Jκ rearrangements. Thus, if antigen-driven receptor editing is the primary force behind conservation of V-gene cRS [7, 9], Vκ gene segments should not be selected for embedded cRS. Fanning et al. [8] noted the presence of a partial heptamer motif (CACA) in Vκ gene segments at a location orthologous to the 3' VH cRS, but to date, there has been no systematic attempt to identify potential cRS at other sites within Vκ gene segments or to determine their function. Demonstrating that cRS within Vκ gene segments are cleaved is an important first step toward identifying their physiologic role(s) and resolving the selective forces that maintain their conservation.
To determine whether the Igκ locus contains active cRS embedded in functional Vκ gene segments, we conducted a computational scan for cRS in Vκ gene segments and evaluated their functionality using LM-PCR. Our results indicate that, despite the capacity for repeated secondary Igκ rearrangements, functional Vκ cRS have been evolutionarily conserved. Vκ cRS are primarily conserved in an orientation (O2) opposite to physiologic Vκ 12-RS and have 23-bp spacers (Table 1 and Figure 1). This conserved orientation and spacer size mirror our earlier demonstration that conserved VH cRS are oriented opposite to physiologic VH 23-RS and contain 12-bp spacers [25].
As with VH cRS, Vκ cRS are conserved at multiple sites in Vκ gene segments and across Vκ gene families. Although our genomic scan identified relatively few Vκ cRS at positions analogous to the 3' VH cRS (nucleotide position 313, IMGT numbering) that mediate VH replacement (Figure 1), we did observe two cRS SE at this location, both in Vκ2 gene segments (Table 2). Of the 10 unique cleavage events at Vκ-embedded cRS, 8 represent cRS SE ≥ 30 nucleotides upstream of complementarity-determining region (CDR) 3 (Table 2). V gene replacement (Vκ → VκJκ) at one of these embedded cRS would result in a substantially lengthened variable-region product that would be unlikely to produce a typically folded L-chain protein. The conservation of functional cRS at such sites in Vκ gene segments, in a locus capable of secondary Vκ → Jκ rearrangements, implies a function distinct from immunological tolerance.
cRS previously identified at the Igκ locus (IRS1, IRS2, and kde/RSκ3) mediate rearrangement events that inactivate the locus and may serve to ensure Igκ allelic exclusion or activation of the Igλ loci (reviewed in [40]). Rearrangements between kde/RSκ3 and IRS result in the deletion of Cκ, and rearrangements between kde/RSκ3 and Vκ RS result in the deletion of Jκ and Cκ [21, 41]. It is possible that the O2 Vκ 23-cRS likewise participate in these inactivation rearrangements, as recombination between IRS and O2 Vκ 23-cRS would result in deletion or inversion of the Jκ gene segment cluster.
Inactivating rearrangements involving IRS and kde/RSκ3 have been implicated in antigen-induced receptor editing (reviewed in [22]), and Kiefer et al. [42] observed RSκ3 cleavage in IgM⁻ BM pre-B cells, IgMlow immature BM B cells, and IgMlowIgD⁺ splenic T3/T3' B cells. Our results indicate that cleavage of O2 Vκ 23-cRS is confined to the IgM⁻, small pre-B compartment (Figures 4 and 5). We conclude that either Vκ cRS SE are rare relative to RSκ3 SE or Vκ cRS SE are not present in immature B cells (perhaps because the cRS themselves are not accessible) and, consequently, may be unrelated to antigen-driven receptor editing. In either case, despite their frequency and function, Vκ cRS appear to play a less significant role in antigen-driven genomic change than do IRS and kde/RSκ3.
The similarities between the VH and Vκ cRS suggest that these DNA motifs are conserved for a common function. Both cRS types are conserved at multiple locations, and both are conserved with an orientation and spacer length opposite to those of the corresponding physiologic V-associated RS. Both sets of cRS are cleaved concurrently with the physiologic RS in the same locus; that is, VH cRS are cleaved in pro-B cells and Vκ cRS are cleaved in pre-B cells. We consider below possible mechanisms for conservation of these V-gene cRS in the Igh and Igκ loci.
First, VH and Vκ cRS could be conserved to inactivate the Igh and Igκ loci. If so, this inactivation might help to ensure allelic exclusion, as evidence indicates that VH [25] and Vκ cRS SE (Figure 5) do not depend on the generation of a functional B-cell receptor. Inactivation of the Igκ locus would increase the proportion of λ-expressing B cells and could act to increase the diversity of the BCR repertoire. A similar argument cannot be made for the Igh locus, as there is no alternative locus. Furthermore, the frequency of IRS-to-kde/RSκ3 rearrangements mitigates any need for V-embedded cRS for inactivation at the κ locus. Thus, we doubt that the selection pressure resulting from locus inactivation via V cRS cleavage is sufficient to result in conservation of the cRS.
We previously suggested that V-embedded cRS could function to form hybrid V gene segments, thereby creating combinatorial diversity beyond that created through the combination of V, D, and J or V and J gene segments [25]. While the results are controversial, there is evidence for such hybrid heavy-chain V genes [43, 44]. Given that both VH and Vκ cRS are conserved in opposite orientation and with the complementary spacer length to physiologic, V-associated RS, we propose that V-embedded cRS may be conserved to recombine with physiologic RS to form hybrid V genes. Under this model, hybrid V gene formation would proceed by a two-step process. Recombination of an O2 Vκ 23-cRS with the same Vκ gene segment's physiologic RS would result in deletion of the intervening nucleotides and generation of an SJ intermediate. A second recombination event could then occur between the RS of the SJ and an O2 Vκ 23-cRS located at the same or a nearby nucleotide position in a downstream Vκ gene segment. This two-step rearrangement would be rare but would result in a novel, hybrid Vκ gene segment of approximately normal length. In particular, utilization of O2 23-cRS located in FR2 would create CDR1–CDR2 combinations not present in the germline.
An alternative hypothesis to the conservation of cRS for their recombinogenic potential is that the nucleotide sequences are conserved to maintain appropriate V-region amino acid sequences and that the corresponding recombinogenic potential is a coincidence. We present evidence that the conservation of O2 cRS embedded in VH and Vκ is not explained by the need to maintain V-region amino acid sequences ([25] and Figure 3). In VH gene segments, the second, third, and fourth nucleotides of the 3' cRS (...TGT G) encode the conserved cysteine at amino acid position 104 (Cys104), while the codon for the conserved cysteine at amino acid position 23 (Cys23) is not part of any known cRS. Cysteine is degenerately encoded (by TGT or TGC), and we find that only 38% of Cys23 are encoded by TGT [25]. Ninety-eight percent of Cys104 are encoded by TGT, however, providing evidence for selection pressure to maintain the recombinogenic potential of the 3' VH cRS [25]. Similarly, analysis of FR codons in Vκ gene segments shows that codon diversity at cRS is reduced relative to the maximum possible to a significantly greater extent than at any other FR site (Figure 3), a finding that implies stringent selection against synonymous nucleotide substitutions in the cRS. The absence of synonymous mutations is important given that the predicted recombinogenic potential of most conserved (116/128) O2 Vκ 23-cRS could be eliminated by a single synonymous nucleotide substitution (data not shown). Of the remaining 12 cRS, the recombinogenic potential of 10 would be significantly reduced (>90%) by one synonymous nucleotide substitution (data not shown). Thus, while nucleotide substitutions in cRS motifs that eliminate efficient recombination without altering Vκ amino acid sequence are potentially frequent, they are rare or absent in the genome. We conclude that there is evolutionary selection for VH- and Vκ-embedded O2 cRS.
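The notion of a single synonymous nucleotide substitution can be made concrete with a short sketch. It enumerates every one-nucleotide change to an in-frame coding sequence that leaves the encoded amino acids unchanged, using the standard genetic code. The function names are hypothetical, and the sketch does not compute recombinogenic potential; in an analysis like the one described above, each variant returned would then be re-scored by whatever cRS scoring method is in use.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def synonymous_variants(cds):
    """Yield (position, new_base, variant) for each single-nucleotide
    substitution that leaves the translated protein unchanged."""
    cds = cds.upper()
    protein = translate(cds)
    for pos in range(len(protein) * 3):
        for new in BASES:
            if new == cds[pos]:
                continue
            variant = cds[:pos] + new + cds[pos + 1:]
            if translate(variant) == protein:
                yield pos, new, variant


# The cysteine codon TGT has exactly one synonymous neighbor, TGC.
cys_variants = list(synonymous_variants("TGT"))
print(cys_variants)
```

For the Cys104 codon, the only synonymous change (TGT to TGC) alters a nucleotide that the text places inside the 3' VH cRS, which is why codon usage at that position is informative about selection on the cRS rather than on the amino acid.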
Another alternative hypothesis to the conservation of V-gene cRS for their recombinogenic potential is that the cRS nonamers are conserved for nucleosome positioning. Consensus RS nonamers may contribute to nucleosome positioning and influence RS accessibility to the V(D)J recombinase [45]. While the cRS nonamers may influence nucleosome positioning, this property is unlikely to explain conservation of V-gene cRS. First, RIC scores are based on the complete cRS sequence, and above-threshold RIC scores would not result from conserved nonamer motifs alone. Second, cleaved Vκ cRS (Table 2) do not contain consensus nonamers and lack the stretch of adenosine nucleotides thought to be responsible for nucleosome positioning [45, 46]. Thus, it is unlikely that selection for nucleosome-positioning motifs has resulted in the maintenance of functional Vκ cRS.
We provide the first exhaustive search, using a rigorous method, for cRS embedded in Vκ gene segments. We demonstrate not only that Vκ cRS are conserved but also that they are cleaved in vivo. We show that the patterns of conservation for Vκ cRS are analogous to those for VH [25], namely that the V-embedded cRS are conserved with an orientation and spacer length opposite to those of V-associated RS in the same locus. We provide evidence that these V-embedded cRS are not conserved as a consequence of selection pressure to maintain V-region amino acid sequence, and we explore several possible explanations for their conservation. While the role of these V-gene cRS is not yet clear, their conservation in both VH [25] and Vκ gene segments implies a substantial evolutionary benefit to their presence.