Prior to the study we received ethical clearance from the Ethical Research Committee of the Hasanuddin University and from the Ministry of Health of the Republic of Indonesia.
Data and sample collection
Clinical data were collected in June 2000. During an active door-to-door survey 88.3% of the population was examined for clinical symptoms of leprosy [6]. The diagnosis was based on the WHO classification. Patients with one lesion were classified as PB1 and those with 2–5 lesions as PB2-5; patients with more than five lesions and/or with a positive bacterial index (BI) in at least one of three skin smears were classified as MB. Persons who reported having completed a full course of multi-drug treatment, and who had no active lesions and negative skin smears, were marked as patients released from treatment (RFT).
At the same time blood samples were collected from the population above 5 years of age: 68.1% of the population. Serum was separated by centrifugation on the same day and kept frozen until use.
During two subsequent population surveys in April 2002 and April 2003 the parent names of the majority of the inhabitants were asked. Furthermore, during the survey in April 2002 interviews were held with elderly people and leprosy patients about their family structure and ancestors. With these data an extended pedigree was prepared. To determine the occurrence of inbreeding the kinship coefficient (the probability that two alleles at a randomly chosen locus, one chosen randomly from individual i and one from individual j, are identical by descent) was computed for parents [20].
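The kinship coefficient defined above can be computed recursively from a pedigree. The following is a minimal sketch under stated assumptions, not the software used in the study: the toy pedigree, the names `PEDIGREE`, `depth` and `kinship`, and the convention that founders are unrelated and non-inbred are all illustrative.

```python
from functools import lru_cache

# Hypothetical toy pedigree: person -> (father, mother); founders have (None, None).
PEDIGREE = {
    "gf": (None, None), "gm": (None, None),
    "x": (None, None), "y": (None, None),
    "a": ("gf", "gm"), "b": ("gf", "gm"),   # a and b are full siblings
    "c": ("a", "x"), "d": ("b", "y"),       # c and d are first cousins
    "e": ("c", "d"),                        # child of a cousin marriage
}

@lru_cache(maxsize=None)
def depth(person):
    """Generations between a person and the most remote known ancestor."""
    parents = [p for p in PEDIGREE[person] if p is not None]
    return 1 + max(depth(p) for p in parents) if parents else 0

@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that random alleles from i and j are identical by descent."""
    if i is None or j is None:
        return 0.0                      # unknown parent: assumed unrelated
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    if depth(i) < depth(j):
        i, j = j, i                     # always expand the deeper individual,
                                        # who cannot be an ancestor of the other
    father, mother = PEDIGREE[i]
    if father is None and mother is None:
        return 0.0                      # two distinct founders
    return 0.5 * (kinship(father, j) + kinship(mother, j))

# Kinship of the parents of "e" indicates whether "e" is inbred.
parents_of_e = PEDIGREE["e"]
inbreeding_of_e = kinship(*parents_of_e)
```

In this toy example the parents of `e` are first cousins, so their kinship coefficient is non-zero, which is the signal of inbreeding that the text describes.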
The longitudes and latitudes of every fifth house were measured using a hand-held Global Positioning System (GPS, Garmin, Kansas USA). In Arcview 3.2 (Esri, California USA) the remaining houses were situated between the geo-referenced houses using a detailed hand-drawn map. The resulting map was used to prepare a geographical distance matrix between all inhabitants.
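As an illustration of how a distance matrix between all inhabitants can be derived from geo-referenced houses, the sketch below uses the haversine great-circle formula. This is an assumption made for the example: the study itself worked from a map in Arcview, and the coordinates, house identifiers and function names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def distance_matrix(person_house, house_coords):
    """Distances between all persons, taken as distances between their houses."""
    persons = sorted(person_house)
    matrix = [
        [haversine_m(house_coords[person_house[p]], house_coords[person_house[q]])
         for q in persons]
        for p in persons
    ]
    return persons, matrix

# Hypothetical coordinates and household membership, for illustration only.
HOUSE_COORDS = {
    "h1": (-5.0000, 119.0000),
    "h2": (-5.0001, 119.0000),
    "h3": (-5.0001, 119.0002),
}
PERSON_HOUSE = {"p1": "h1", "p2": "h1", "p3": "h2", "p4": "h3"}

persons, D = distance_matrix(PERSON_HOUSE, HOUSE_COORDS)
```

Persons who share a house get a distance of zero, which matches the way the household model is later treated as a special case of the spatial model.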
Statistical analysis
Leprosy prevalence was defined as the proportion of the sum of leprosy patients and RFT patients over the population screened for leprosy in June 2000. Even though it is not common practice, for the purpose of this particular research question RFT patients were included in the prevalence. Seroprevalence was defined as the proportion of seropositive persons (including seropositive patients) over the population screened for antibodies.
The score statistic Q, which is built from the products R_ij(z_i - π_i)(z_j - π_j) summed over pairs of subjects (i,j), was used to test clustering of leprosy per se, MB leprosy, PB leprosy and seropositivity due to genetic, household and spatial effects. Here z_i is the outcome for subject i (1 if affected and 0 otherwise), π_i is the age- and sex-specific prevalence and R_ij is the genetic, household or spatial correlation for subjects i and j. The specific R_ij are described below. In the simple case of π_i = 0.5 for all i, the statistic Q reduces to the sum over concordant pairs (i,j) (for example leprosy patient - leprosy patient) of R_ij minus the sum over discordant pairs (i,j) (leprosy patient - person with no leprosy) of R_ij. In general the statistic Q tends to be large when concordant pairs have higher correlations R_ij than discordant pairs. For the score test it is important to realise that healthy persons also provide information, although not as much as the patients. The distribution of Q under the null hypothesis of no correlation can be approximated by a scaled chi-square distribution with scale parameter 0.5Var(Q)/E(Q) and 2E(Q)^2/Var(Q) degrees of freedom. Formulae for the expectation and variance of Q can be found in Houwing-Duistermaat et al. [23].
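The behaviour of Q described above can be sketched in a few lines. The quadratic form used here (a sum over unordered pairs, without any constant factor or diagonal terms) is an assumption chosen to be consistent with the description; the exact expression, and the formulae for E(Q) and Var(Q), are those of reference [23]. The function names are hypothetical.

```python
from itertools import combinations

def score_statistic(z, pi, R):
    """Assumed form: Q = sum over pairs i<j of R[i][j] * (z_i - pi_i) * (z_j - pi_j)."""
    n = len(z)
    return sum(
        R[i][j] * (z[i] - pi[i]) * (z[j] - pi[j])
        for i, j in combinations(range(n), 2)
    )

def scaled_chi_square_params(mean_q, var_q):
    """Moment matching: scale * chi2(df) has mean E(Q) and variance Var(Q)."""
    scale = 0.5 * var_q / mean_q
    df = 2.0 * mean_q ** 2 / var_q
    return scale, df

# Toy data: two affected and two unaffected subjects, all with pi_i = 0.5.
z = [1, 1, 0, 0]
pi = [0.5, 0.5, 0.5, 0.5]
R = [
    [1.0,   0.5,   0.25,  0.0],
    [0.5,   1.0,   0.125, 0.25],
    [0.25,  0.125, 1.0,   0.5],
    [0.0,   0.25,  0.5,   1.0],
]
q = score_statistic(z, pi, R)

# With pi_i = 0.5 every pair contributes +R_ij/4 (concordant) or -R_ij/4 (discordant).
concordant = R[0][1] + R[2][3]
discordant = R[0][2] + R[0][3] + R[1][2] + R[1][3]
```

With the scale and degrees of freedom in hand, a p-value would follow from the upper tail of a chi-square distribution evaluated at Q divided by the scale.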
The correlation structures R_ij corresponding to the genetic, household and spatial effects were based on distances between individuals. For the genetic model the correlation between pairs is based on genetic distances (d_g) in the pedigree; siblings have a higher correlation than cousin pairs, and unrelated persons have no correlation. Specifically R_ij = 1/2^(d_g), which is equivalent to two times the kinship coefficient. In the household model the distance between individuals sharing the same household is zero, which gives a correlation of 1, and the distance is infinite for inhabitants of different households (R_ij = 0). The spatial model is an extension of the household model. The distance for the spatial model (d_e) equals the distance between 2 households in metres. We used the following formula: R_ij = exp(-d_e,ij/44). In previous studies it was shown that apart from household contacts also first and second neighbours have an increased risk of developing leprosy [5]. The number 44 still gives a good correlation between a house and its second neighbour: for d_e = 11 (the mean distance between a house and its nearest (first) neighbour) R_ij = 0.779, for d_e = 22 (the assumed distance between a house and its second neighbour) R_ij = 0.607 and for d_e = 33 R_ij = 0.473. This last correlation is seen as a moderate correlation [24]. Thus, the correlation decreases when the distance between 2 households becomes larger. We performed a sensitivity analysis in which the number 44 was changed into 33 and 55. Spearman rank correlation coefficients were computed between the correlations R_ij of the different random effects.
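The three correlation structures can be written as small functions, which also makes the sensitivity analysis on the decay constant easy to reproduce. This is a sketch for illustration; the function names and the use of `None` to encode unrelated persons are assumptions, not part of the original analysis.

```python
import math

def genetic_correlation(d_g):
    """R_ij = (1/2)**d_g; d_g is None for unrelated persons (no correlation)."""
    return 0.0 if d_g is None else 0.5 ** d_g

def household_correlation(house_i, house_j):
    """R_ij = 1 within a household (distance zero), 0 otherwise (distance infinite)."""
    return 1.0 if house_i == house_j else 0.0

def spatial_correlation(distance_m, decay=44.0):
    """R_ij = exp(-d_e / decay), with d_e the distance between households in metres."""
    return math.exp(-distance_m / decay)

# Sensitivity analysis: correlations at 0, 11, 22 and 33 m for each decay constant.
sensitivity = {
    decay: [round(spatial_correlation(d, decay), 3) for d in (0, 11, 22, 33)]
    for decay in (33, 44, 55)
}
```

A distance of zero gives a spatial correlation of 1, so household members are treated the same way in the spatial and household models; only the treatment of neighbours differs.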
In the analysis for leprosy per se all patients detected in June 2000 and the RFT patients were included. These RFT patients were excluded from the separate analyses for MB and PB leprosy, because their classification could not be confirmed. For leprosy per se the test was, apart from the total population, also performed on a subpopulation which was expected to have had a relatively stable household status over the last 20 years, namely the population below 21 and above 39 years. The data showed that up to the age of 20 years, 84% (291/346) still lived in the same house as their mother, whereas after the age of 20 this percentage was much smaller (12%; 25/214), indicating that most people moved when they were around 20 years of age. Interviews revealed that most people move only once in their life, namely when they get married and move from their parental house into their own house. Persons aged 21–39 were excluded because most of them were expected to have had a change in household status within the last 20 years.
Heritability estimates were calculated for leprosy per se and for seropositivity using a random effects model with a logit link and assuming Gaussian random effects [25]. The heritability estimates are presented with two-sided 95% confidence intervals (95% CI). The confidence intervals were estimated using profile likelihood. Both the score statistic and the heritability estimates were adjusted for the covariates age (continuous) and sex.
Finally, the risk ratios for siblings (λs) for leprosy per se and seropositivity, defined as the ratio of the risk of leprosy/seropositivity for siblings of affected persons to the risk for the general population, were calculated separately for the group under 21 years and the group of 21 years and older according to the method described by Olson and Cordell [26]. Confidence intervals were calculated according to the method of Zou and Zhao [27]. Different age groups were used because the risk of leprosy/seropositivity for the general population (i.e. the prevalence) differed between age groups.
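To make the definition of λs concrete, the sketch below computes a naive plug-in version: the proportion of affected siblings among all siblings of affected persons, divided by the population prevalence. It illustrates the definition only and is not claimed to reproduce the estimator of Olson and Cordell [26] or the confidence intervals of Zou and Zhao [27]; the data and the function name are hypothetical.

```python
def sibling_risk_ratio(sibships, prevalence):
    """Naive lambda_s: risk for siblings of affected persons / population prevalence.

    sibships   -- list of sibships, each a list of 0/1 disease indicators
    prevalence -- risk in the general population (same age group)
    """
    affected_sibs = 0   # affected siblings, counted once per affected index person
    total_sibs = 0      # all siblings of affected index persons
    for sibship in sibships:
        size = len(sibship)
        affected = sum(sibship)
        # Each affected person has (size - 1) siblings, of whom (affected - 1) are affected.
        affected_sibs += affected * (affected - 1)
        total_sibs += affected * (size - 1)
    if total_sibs == 0 or prevalence <= 0:
        raise ValueError("no siblings of affected persons, or non-positive prevalence")
    return (affected_sibs / total_sibs) / prevalence

# Hypothetical sibships for one age group, with an assumed prevalence of 10%.
sibships = [[1, 1, 0], [1, 0], [0, 0, 0], [1, 1]]
lambda_s = sibling_risk_ratio(sibships, prevalence=0.10)
```

Because the denominator is the prevalence, the same sibling risk gives a different λs in age groups with different prevalence, which is why the calculation is done per age group.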