Background
The immense genetic variability of HIV-1 viruses is considered the key factor that frustrates efforts to halt the virus epidemic and poses a serious challenge to the development and efficacy of vaccines. Like other human positive-sense RNA viruses, HIV has a high mutation rate as a result of the error-prone nature of its reverse transcriptase (3 × 10⁻⁵ mutations per nucleotide per replication cycle) [
1,
2]. This high rate of mutation, coupled with the increased replication capacity of the virus (10.3 × 10⁹ particles per day) [
3], allows for the accumulation and fixation of a variety of advantageous genetic changes in a virus population, which are selected for by the host immune response and can resist newly evolving host defense. Recombination is another potential evolutionary source that significantly contributes to the genetic diversification of HIV by successfully repairing defective viral genes and by producing new viruses [
4]. To date, HIV-1 viruses are classified into four phylogenetic groups: M, O, N and P, which most likely reflect four independent events of cross-species transmission from chimpanzees [
5‐
7]. The M group (for main), responsible for the majority of viral infections worldwide, is further subdivided into nine subtypes (A-D, F-H, J and K), among which subtypes A and F have been further classified into two sub-subtypes [
5]. Moreover, early sequencing studies have provided evidence of recombination between genomes of different HIV subtypes [
8,
9]. Such interclade recombinant strains are consistently reported from regions where two or more clades are predominant. Recombinant strains from at least three unlinked epidemiological sources, which exhibit identical mosaic patterns, have been classified separately as circulating recombinant forms (CRFs) [
10,
11]. Currently, there are more than 40 defined CRFs (http://www.hiv.lanl.gov) that are as epidemiologically important as subtypes [
12]. In addition to the known CRFs, a large number of unique recombinant viruses, which are called unique recombinant forms (URFs), have been characterized worldwide [
13]. Together, CRFs and URFs account for 18% of incident infections in the global HIV-1 pandemic [
12]. HIV-1 subtypes, CRFs and URFs show considerably different patterns of distribution in different geographical regions [
12,
14].
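As a rough illustration of the rates quoted above, the following Python sketch multiplies the per-nucleotide mutation rate by an assumed genome length to estimate how many new mutations a single replication cycle introduces. The genome length of about 9,700 nucleotides is an assumption that does not appear in the text, as are the Poisson model and the simplification that every new particle carries a genome copied in one cycle; the result is an order-of-magnitude sketch, not a measured quantity.

```python
import math

MUTATION_RATE = 3e-5       # mutations per nucleotide per replication cycle (from the text)
VIRIONS_PER_DAY = 10.3e9   # particles produced per day (from the text)
GENOME_LENGTH_NT = 9_700   # ASSUMPTION: approximate HIV-1 genome length, not given in the text


def expected_mutations_per_genome(rate: float, length: int) -> float:
    """Mean number of new mutations per genome per replication cycle."""
    return rate * length


def fraction_with_mutation(rate: float, length: int) -> float:
    """Fraction of genomes with at least one new mutation.

    ASSUMPTION: mutations arise independently, so their count per genome
    is approximately Poisson distributed with mean rate * length.
    """
    return 1.0 - math.exp(-rate * length)


mean_mutations = expected_mutations_per_genome(MUTATION_RATE, GENOME_LENGTH_NT)
mutant_fraction = fraction_with_mutation(MUTATION_RATE, GENOME_LENGTH_NT)

# ASSUMPTION: each particle carries a genome produced in a single cycle.
mutant_particles_per_day = mutant_fraction * VIRIONS_PER_DAY

print(f"mean new mutations per genome per cycle: {mean_mutations:.3f}")
print(f"fraction of genomes with >= 1 new mutation: {mutant_fraction:.3f}")
print(f"particles per day carrying a new mutation: {mutant_particles_per_day:.2e}")
```

Under these assumptions, roughly a quarter of newly copied genomes would carry at least one new mutation, which is consistent with the rapid accumulation of diversity described above.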
In Brazil, an estimated 730,000 persons were living with HIV at the beginning of 2008 (2008 Report on the Global AIDS Epidemic). As in European countries and in North America, HIV-1 subtype B is a major genetic clade circulating in the country. However, the existence of other subtypes and recombinants, such as F1, C, B/C and B/F, has been consistently reported [
15‐
23]. Data from recent studies of near full-length genomes (NFLGs) of HIV have provided evidence of Brazilian CRF strains designated CRF28_BF, CRF29_BF, CRF39_BF, CRF40_BF and CRF31_BC [
17,
24‐
26] (http://www.hiv.lanl.gov/content/sequence/HIV/CRFs/CRFs.html).
In 2006, Thompson and colleagues [
27] published two NFLGs of similar BF1 mosaic viruses from patients in Rio de Janeiro: 94BR-RJ-41 (GenBank: AY455781) and 99UFRJ-16 (GenBank: AY455782). Here, we describe the HIV-1 NFLGs of six additional isolates with similar BF1 mosaic genomes from patients without evidence of direct epidemiological linkage.
Discussion
In the present study, we have characterized six NFLG sequences that possess a mosaic genomic structure identical to that of the previously described strains 94BR-RJ-41 and 99UFRJ-16, with a genome of predominantly subtype F1 and the
nef-U3 overlap portion of the LTR of subtype B (Figure
1). Moreover, three additional full-length genome sequences, which were initially characterized as pure subclade F1, now clearly appear to harbor a small fragment derived from subtype B in their LTR at a position identical to the breakpoint reported in our sequences. In the phylogenetic trees of the full-length and subgenomic regions of the F1 subclade segment, isolates F1.JP.2004.DR6082 and F1.JP.2004.DR6190 (recovered from Japanese patients) and 94BR-RJ-41 and 99UFRJ-16 (recovered from patients residing in Rio de Janeiro) fall outside the single cluster formed by isolate 01BR087 and all BF1 recombinants identified in this study, except 06BR FPS561 (recovered from patients residing in São Paulo) (Figures
2a and
3a). The discordant branching between
gag-pol and
env sequences of isolates 06BR FPS561, 94BR-RJ-41 and 99UFRJ-16 can be explained by additional recombination events after the spread of their common ancestor. Overall, our results suggest that the 11 recombinant sequences resulted not from one but from at least three independent recombination events that produced similar simple recombinant structures. In particular, BF sequences isolated in Japan and Rio de Janeiro may have originated from different BF recombinant ancestors than the sequences isolated in São Paulo. Thus, by excluding all the isolates that branch outside the main cluster, we provide a total of six sequences (01BR087 and five sequences described in this study) that meet the formal requirement for assigning a new CRF, CRF46_BF1. In the phylogenetic tree of the F1 subclade fragment, the two recently isolated Japanese strains (F1.JP.2004.DR6190 and F1.JP.2004.DR6082) formed a tight subcluster with isolate 06BR FPS561 and branched outside the subcluster formed by the other five viruses described in this study, but still fell clearly within the main Brazilian subclade F1 sequences. This result suggests that the viruses found in the Japanese patients share a distinct common ancestry originating in Brazil. It is possible that heavy travel between the two countries facilitated the spread of these viruses in both.
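The formal requirement invoked above, as stated in the Background (identical mosaic patterns in strains from at least three unlinked epidemiological sources), can be sketched as a simple check in Python. The `Isolate` record, the tuple encoding of the mosaic pattern, the source labels and the toy isolates are all hypothetical and used only for illustration; they are not the data or the procedure of this study, and a real assignment also rests on the phylogenetic analysis described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isolate:
    name: str
    mosaic: tuple   # hypothetical encoding of the mosaic pattern, e.g. ("F1", "B")
    source: str     # hypothetical label; epidemiologically linked patients share one label


def meets_crf_requirement(isolates, min_sources=3):
    """Return True if all isolates share one mosaic pattern and come from
    at least `min_sources` distinct (unlinked) epidemiological sources."""
    if not isolates:
        return False
    patterns = {iso.mosaic for iso in isolates}
    if len(patterns) != 1:
        return False  # mosaic structures are not identical
    sources = {iso.source for iso in isolates}
    return len(sources) >= min_sources


# Toy data only: names and sources are placeholders, not isolates from the study.
PATTERN = ("F1", "B")  # predominantly F1 genome with a subtype B segment
unlinked = [Isolate(f"toy_{i}", PATTERN, f"source_{i}") for i in range(3)]
linked = [Isolate(f"toy_{i}", PATTERN, "source_0") for i in range(3)]

print(meets_crf_requirement(unlinked))
print(meets_crf_requirement(linked))
```

The check treats the mosaic pattern as an exact match, which mirrors the word "identical" in the definition; isolates with the same pattern but a shared source do not count more than once.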
Based on the criteria of inclusion of the samples in this study, we were able to show that the CRF46_BF1 accounts for 0.56% of the HIV-1 circulating strains in São Paulo, similar to the frequency of subclade F1 reported from this region [
28]. The apparently low prevalence of CRF46_BF1 is probably ecological: it may reflect not inherent properties of the virus itself but a founder effect, in which subtype B (a founder virus in Brazil) was introduced and became established in our HIV-infected population before the new CRF and other subtypes were introduced.
Our analysis also showed that the recombination of subclade F1 with subtype B at the
nef-U3 overlap portion of the LTR appears to be a recurrent finding because it has also been found in CRF39_BF1 and other unique HIV-1 recombinants [
17,
25,
40,
41]. In HIV, the existence of recombinational hot spots is common given that they have been described in cell-free systems [
42] and exist in the dimer initiation sequence of the HIV-1 5'-untranslated region and at some preferential sites across the viral genome [
43‐
46]. Several studies have demonstrated that RNA hairpin structures strongly correlate with recombination hotspots in various regions of the HIV-1 genome[
42,
43,
46,
47]. Thus, based on the latter mechanism, it is possible that hairpins promote recombination by impeding the RT during reverse transcription or by interacting directly with the template [
46,
48,
49].
The HIV-1 LTR region is composed of various cis-acting regulatory components needed for proviral DNA synthesis, integration of the nascent viral cDNA into the host cell genome, transcription and modulation of HIV gene expression [
50,
51]. Early reports showed that the LTR region is made up of three segments designated as U3, R and U5 [
52]. The U3 modulatory region entirely overlaps with
nef [
53] and is essential during reverse transcription for the first template transfer and for integration of the provirus into the host genome. Moreover, this region seems to regulate transcription from HIV promoters by interacting directly or indirectly with a large number of cellular proteins, including NF-AT, Ets-1, USF, AP-1, COUP and Sp1 [
54]. Thus, substitution through recombination of the
nef-U3 overlap portion of the LTR with that of a genetically different subtype, as in our isolates, may affect the binding of both cellular and viral transcription factors. In turn, this may influence viral transcription levels, potentially enhancing the propagation of a recombinant virus and leading to the persistence of a circulating form.
Several studies have reported successful inhibition of HIV-1 replication using synthetic siRNAs targeting either viral RNA sequences or cellular mRNAs encoding proteins that are critical for HIV-1 replication [
55‐
58]. The study conducted by Yamamoto and his colleagues [
59] showed considerable and sustained suppression of HIV replication and control of CC-chemokine production associated with
nef expression in HIV-1-infected macrophages following transfection of short hairpin RNA (shRNA) by a lentivirus vector system expressing HIV-specific shRNAs. These results allowed the authors to conclude that lentivirus-vector-based RNA interference of the U3-overlapping region of HIV-1
nef may be useful as a genetic vaccine against HIV-1 infection. Furthermore, Ludwig and collaborators [
60] showed that HIV-1 contains an antisense gene in the U3-R regions of the LTR responsible for both an antisense RNA transcript and proteins. This antisense transcript has tremendous potential for intrinsic RNA regulation because it overlaps the beginning of all HIV-1 sense RNA transcripts by 25 nucleotides. The novel HIV antisense proteins are encoded in a region of the LTR that has already been shown to be deleted in some HIV-infected long-term survivors, and they represent new potential targets for vaccine development [
60,
61].
Given the biological relevance ascribed to the U3 region, it is probable that intersubtype recombination in this region plays an important role in HIV evolution, with critical consequences for the development of efficient genetic vaccines.
During phylogenetic analysis, the B fragments of our six strains and the other five strains (marked with a triangle symbol in Figure
2b), which showed identical mosaic genomic structures, were clearly distinct from available South American subclade F1 sequences, particularly those of Brazilian origin. This result, coupled with the absence of the 13-15-nucleotide insertion downstream of the NF-κB
III binding site, which is typical for subclade F1, agrees with the interpretation that the segment at the
nef-U3 overlap portion of the LTR of the eleven isolates originates from subtype B. Unlike the marked clustering of the eleven isolates in the tree generated from the F1 fragment, the tree of fragment B depicted in Figure
2b shows them to fall in different sub-branches within subtype B reference sequences. This result is most likely explained by the short lengths of the fragment B sequences.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
SS conceived and designed the study, did the data analysis of the sequences, and wrote the manuscript. ÉRP, WKN and VPM conducted the characterization of the full-length genome analysis. ECS designed and directed the study and wrote the manuscript. All authors read and approved the final manuscript.