Background
AIDS, caused by the retrovirus HIV, is predicted to become the single largest cause of morbidity globally by 2030, as measured by disability-adjusted life-years [
1]. African countries currently have the highest disease burden of HIV, with 9.2% prevalence in Addis Ababa in Ethiopia and over 10% in Dar-es-Salaam in Tanzania, yet almost all genetic studies have focused on cohorts from Western countries [
2]. The genetic architecture of HIV susceptibility in Africans is likely to differ from that in Europeans, yet genome-wide association studies of host susceptibility to HIV have not yielded any significant results [
3]. These studies miss regions that show copy number variation, particularly structurally complex regions that are not correlated with alleles at flanking SNP markers [
4].
Copy number variation (CNV) is defined as the variation in copy number of a given DNA sequence in a diploid genome. CNV is common in the genome, affects gene expression, and involves immune response genes [
5‐
7], suggesting that it may affect susceptibility of the host to infectious disease. CNV of the killer cell immunoglobulin-like receptor genes has been shown to affect host control of HIV infection, as determined by the viral load (VL) at setpoint [
8], and we have recently shown an association of β-defensin CNV both with HIV viral load at initiation of highly active anti-retroviral therapy (HAART) and with subsequent immune reconstitution [
9].
The genes
CCL3L1/
CCL4L1 encode the chemokines MIP-1α and MIP-1β, both of which are ligands for the chemokine receptor CCR5, which is used as a co-receptor by R5 strains of HIV. These genes show CNV, and this has been shown to affect HIV acquisition, progression to AIDS, and immune reconstitution following HAART [
10‐
12]. An attractive model is that these chemokines and HIV compete for the same receptor CCR5, and that increasing copy number increases the levels of chemokine, thereby increasing competition with HIV for the receptor [
13]. Supporting this hypothesis requires a gene dosage effect linking gene copy number to protein levels, but the evidence has been contradictory. Early studies supported a gene dosage effect [
10,
11], but recent studies have suggested that the influence of extra gene copies on total protein levels is low [
14,
15]. A problem in these experiments is that the protein products of
CCL3 (called MIP1α-LD78α) and
CCL3L1 (MIP1α-LD78β) cannot be discriminated using standard antibodies. Thus analyses using antibody-based detection of protein products may not detect a gene dosage effect, particularly given the higher levels of
CCL3 transcription and presumably MIP1α-LD78α in the blood. Although both protein isoforms signal through CCR5, only the LD78β isoform can be cleaved by dipeptidyl peptidase IV to generate a monocyte attractant and CCR1 agonist [
16,
17]. Indeed, functional evidence remains supportive: measuring the chemotactic response of cells to supernatants from lipopolysaccharide-stimulated monocytes from different individuals supports an effect of different
CCL3L1 gene copy number [
10]. However, other mechanisms for an effect of
CCL3L1 copy number can be envisaged, acting either directly or indirectly by affecting other immunological phenotypes such as the CD4+ cell count.
Attempts at replicating the genetic association of
CCL3L1 copy number and HIV susceptibility have yielded contrasting results. A meta-analysis of nine studies has supported an association of lower
CCL3L1 copy number with susceptibility to HIV [
18], but this study did not critically analyse the quality of the published data used in the meta-analysis. For example, the use of quantitative PCR to determine
CCL3L1 copy number may generate false-positive associations [
19‐
21]. It may be that
CCL3L1 and
CCL4L1 do not always vary in copy number as a block, which might explain at least some of the heterogeneity in results when different methods are used to determine copy number. However, when more robust and reliable methods are applied to large European cohorts, there is no evidence of this, suggesting that, when measured with sufficient precision and accuracy,
CCL3L1 and
CCL4L1 covary as a block [
22,
23]. In common with most of the literature, we refer to this copy number variation as
CCL3L1 copy number variation, but it should be remembered that it also involves
CCL4L1 and possibly
TBC1D3.
CCL3L1 CNV has also been associated with a variety of other infectious diseases, including tuberculosis [
24], hepatitis B [
25], hepatitis C [
26] and Kawasaki Disease [
27]. Such association studies are almost always small, use qPCR to type copy number, and are not necessarily replicated [
28], and in some cases the reported association is seen only on a background of a particular genotype at another locus. While such studies are based on reasonable hypotheses concerning the function and interaction of proteins and pathogens, their marginal significance levels and limited power mean that drawing definitive conclusions regarding the role of genetic variation remains difficult. In the most technically and genetically thorough study to date, a weak suggestive association with protection from anemia in malarial infection was found, but this family-based study also lacked power to detect anything but strong effects [
29].
Evidence from other African studies of
CCL3L1 and HIV has been contradictory. In a small Zimbabwean longitudinal cohort, no association of
CCL3L1 copy number with HIV status or progression was found [
30]. However, analysis of mother-to-child transmission in South Africa suggested that higher copy number was protective against HIV transmission [
31]. In this context, we decided to analyse our previously described cohort of HIV patients from Ethiopia and Tanzania for association of
CCL3L1 copy number with viral load immediately prior to HAART and immune reconstitution during HAART. African populations are known to have a higher average copy number than European populations [
11,
31], due either to natural selection or genetic drift. This has the advantage, in an association study context, of providing a wider range of copy number and therefore a potentially larger gene dosage effect. However, there are significant technical challenges in accurately typing multiallelic copy numbers at this, or indeed other, loci. We decided to use the paralogue ratio test (PRT) to determine copy number, which is the most robust technique available for typing this locus on large cohorts [
19,
21].
Discussion
It has been observed previously that, despite HAART being effective at reducing HIV load to below measurable levels, CD4+ cell count does not always return to healthy levels [
36]. This might be due to a variety of factors, including host genetics and co-infection status. Indeed, we demonstrate in this study (Table
3) that both baseline CD4+ cell count and absence of tuberculosis (TB) have a positive effect on the CD4+ count following initiation of HAART, a commonly used measure of immune reconstitution. The role of host genetic variation in influencing different rates of immune reconstitution during HAART is not well understood, yet is of increasing importance as HAART programmes are initiated and continued in areas of high HIV prevalence. Several candidate genes have been suggested to play a role, including a haplotype of the
TRAIL gene and copy number variation of the β-defensin genes [
9,
37]. This study suggests that
CCL3L1 copy number has a stronger effect on immune reconstitution than β-defensin copy number (β-defensin β = −3.63 CD4+ cells/ml per copy,
CCL3L1 β = −4.75 CD4+ cells/ml per copy). However, unlike β-defensin copy number, we find no effect of
CCL3L1 copy number on viral load during acute HIV infection, just prior to initiation of HAART.
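To make the two per-copy coefficients quoted above concrete, the following minimal sketch converts them into an expected difference in CD4+ count between two individuals who differ only in copy number. It assumes a purely linear, additive per-copy effect with all other covariates held fixed. The function name and the example copy numbers (2 and 6) are hypothetical choices for illustration and are not taken from the study.

```python
# Per-copy regression coefficients as quoted in the text
# (CD4+ cells/ml per gene copy).
BETA_PER_COPY = {"beta-defensin": -3.63, "CCL3L1": -4.75}


def expected_cd4_difference(locus: str, copies_a: int, copies_b: int) -> float:
    """Expected CD4+ count of individual B minus individual A.

    Assumes a linear per-copy effect and identical values for every
    other covariate (an illustrative simplification).
    """
    return BETA_PER_COPY[locus] * (copies_b - copies_a)


if __name__ == "__main__":
    # Hypothetical comparison: 6 copies versus 2 copies at each locus.
    for locus in BETA_PER_COPY:
        diff = expected_cd4_difference(locus, 2, 6)
        print(f"{locus}: {diff:+.2f} CD4+ cells/ml for 6 vs 2 copies")
```

Under these assumptions, a difference of four copies corresponds to four times the per-copy coefficient, which shows why a wider copy number range gives a larger expected gene dosage effect.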
Previous studies have used combined data from different ethnic groups, with very different
CCL3L1 copy numbers, with HAART started at different CD4 count thresholds. It might be argued that variation in ethnicity was a confounding factor, so that ethnicity, rather than
CCL3L1 copy number per se, was responsible for the variation in immunological reconstitution. While our cohort is in no way genetically homogeneous, a fact that we attempt to account for in part by using country of origin as a cofactor in our analyses, our study does not combine two dichotomous ethnic groups with very different
CCL3L1 copy number counts and different levels of access to healthcare [
12]. Our entire cohort is also completely naïve to antiretroviral therapy prior to initiation of HAART, unlike those previously studied [
12,
38].
Although we have taken care to ensure the optimum quality of our copy number typing, problems remain, particularly in distinguishing higher copy numbers, which are frequent in sub-Saharan African populations. Part of this is technical, due to inherent noise in the assays used, and part biological, due to the variation in repeat structure apparent in certain populations. Neither issue can be resolved easily without more extensive work on the nature and extent of structural variation at this locus in different populations, and we suggest that this should be a prerequisite before a comprehensive analysis of the clinical role of CCL3L1 copy number can be made. The Genome Reference Consortium has assembled a reference allele from sequencing BACs from a genomic library derived from a hydatidiform mole, which contains one copy of the CCL3L1 and CCL4L1 genes and is likely to represent the most common allele in Europeans (accession number GL383560.1). However, we show here that the high-copy alleles characteristic of African populations are not necessarily simply related to the European alleles, and there is clearly a need for accessible physical remapping approaches that can be applied to a significant number of samples to fully characterise structural variation at this locus.
There are three other caveats in interpretation of our study. Firstly, although we control for co-infection with tuberculosis, which represents the major co-morbidity in these populations, we cannot rule out that the effect of
CCL3L1 copy number is indirect, via another infection, rather than acting directly on immune reconstitution. Secondly, as stated previously, the copy number variation also involves the genes for the chemokine
CCL4L1, and
TBC1D3, a protein involved in macropinocytosis [
39]. Although
CCL3L1 is the favoured candidate for mediating the effect of copy number based on the known functional role of the chemokine, a role for the other gene products should not be completely ruled out. Thirdly, we also cannot rule out an indirect effect of
CCL3L1 copy number mediated by an effect on CD4+ levels immediately after seroconversion, which have been shown to affect immune reconstitution [
40].
Acknowledgements
We thank Don Conrad for access to the Agilent 210 k arrayCGH data, Mark Jobling for access to the ABI3130xl capillary electrophoresis platform, and the patients for participating in this study.
Funding
This work was supported by a United Kingdom Medical Research Council New Investigator award [grant number GO801123] and a Wellcome Trust project grant [WT087663] to E.J.H.; European & Developing Countries Clinical Trials Partnership [grant numbers CT.2005.32030.001, CG_TA.05.40204_005]; and the Swedish International Development Cooperation Agency/Department for Research Cooperation [grant numbers HIV-2006-031, SWE 2007–270, VR 521-2011-3437]. Core facility funding was supported by the Wellcome Trust [grant number WT098051].
Competing interests
EJH has received grant funding from Pfizer Inc, which had no influence on the conception, design or analysis of this work, and no role in manuscript preparation or publication.
Authors’ contributions
EA and EJH conceived and designed the study. Experiments were performed by LOH, JB, RH, BF and FY. Data were analysed by EJH, LOH, JB, BF, FY and MV. Clinical data and patient samples were provided by EA, AH, EN, GY, WA, SM, OM, EM, MJ, FM and GA. All authors read and approved the final manuscript.