Study area
This study was conducted in Antananarivo, the capital city of Madagascar. Antananarivo is the most densely populated city in Madagascar, with a population of 1,114,346 spread across 90 km², giving a population density of almost 8687 inhabitants/km². The city has six administrative districts, comprising a total of 192 neighbourhoods. It is located at an altitude of 900 m to 1500 m and has a high-altitude tropical climate [18], with two seasons: a hot rainy season from October to April and a cold dry season from May to September. The average annual temperature is 18°C, with a maximum of 26°C (in November) and a minimum of 10°C (in July).
The population is served by three university teaching hospitals (Centre Hospitalier Universitaire, CHU), 105 health centres and 16 Tuberculosis Diagnostic and Treatment Centres (DTC). The city has good health service coverage, but the neighbourhoods are considerably heterogeneous in socio-economic conditions, and some of them are affected by overcrowding, substandard housing and unemployment.
Epidemiological data sources
The 16 tuberculosis DTCs of the city provided the data on new TB cases. TB cases are registered in a routine information system as part of the official TB control program. The TB registry contains information on the patient's place of residence, treatment follow-up and status (new case, retreatment, treatment failure, or relapse after the completion of treatment) and treatment outcome (recovery, completion of treatment, etc.). All cases were followed up during treatment, and new cases underwent bacteriological check-ups at 2, 5 and 7 months. Cases of pulmonary tuberculosis were defined as patients presenting with a cough lasting more than three weeks and confirmed by a positive sputum smear. All new cases recorded in DTC registries from 2004 to 2006 and corresponding to patients resident in the city of Antananarivo were included in this study. Approval for this study was obtained from the National Ethics Committee of the Ministry of Health of Madagascar.
Bayesian approach
A Bayesian approach combined with a generalized linear mixed model (GLMM) was used to assess spatial heterogeneity in the TB standardized incidence ratio (SIR) and to investigate associations between the three-year average TB incidence rates and the following five variables: number of patients undergoing retreatment ($X_{1i}$), number of patients with treatment failure or relapse after the completion of treatment ($X_{2i}$), number of patients stopping treatment within two months (lost to follow-up) ($X_{3i}$), number of households with more than one case ($X_{4i}$), and distance from the patient's residence to the DTC ($X_{5i}$). All these $X_{ki}$ (with k = 1, 2, ..., 5), calculated for each neighbourhood "i" (i = 1, 2, ..., 192), were obtained from the TB registries that form part of the national TB control program. These $X_{ki}$ were chosen as explanatory variables because they are indicators of the healthcare system and, to some extent, carry information on the socio-economic and hygienic status of the neighbourhood and therefore on the likelihood of TB transmission. $X_5$ is considered a distal variable, whereas all the other X's are considered proximal variables for TB transmission. For instance, since TB is transmitted by close contact between infectious and susceptible people, the number of households with more than one case can be informative of population density; one may also ask whether new cases arise in populations that include patients undergoing retreatment and/or with treatment failure. Likewise, the number of patients stopping their treatment may be indicative of their socio-economic conditions and/or education level, and living far from the DTC could hinder access to health care facilities and thus discourage patients from completing their treatment.
This study was conducted at the neighbourhood scale. For each neighbourhood "i" (i = 1, 2, ..., n), the expected number of new cases $\varepsilon_i$ was estimated as the mean new case rate over all districts multiplied by the population of the neighbourhood (i.e., $\varepsilon_i = \text{mean new case rate} \times \text{pop}_i$), and the standardized incidence ratio (SIR) $\lambda_i$ of each neighbourhood was calculated as the observed number of new cases divided by the expected number of cases. Within the Bayesian framework, the observed numbers of new cases $y = (y_1, \ldots, y_n)$ in the n neighbourhoods were treated as non-independent Poisson random variables with means $\mu = (\mu_1, \ldots, \mu_n)$, where each $\mu_i$ is given by $\mu_i = \varepsilon_i \times \lambda_i$ or, in logarithmic form, $\log(\mu_i) = \log(\varepsilon_i) + \log(\lambda_i)$. The SIR $\lambda_i$ is a function of the explanatory variables $X_{ki}$ that account for differences and spatial heterogeneity in the disease rate:

$$\lambda_i = \exp(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i} + \theta_i + \nu_i),$$

with $x_{ki} = X_{ki}/SD_k$, where $SD_k$ is the standard deviation (over all neighbourhoods) of variable k.
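The quantities defined above can be illustrated with a minimal Python sketch. The function names and toy numbers are hypothetical and are not taken from the study; the sketch also assumes that the "mean new case rate" is the pooled rate (total new cases divided by total population) and that $SD_k$ is the population standard deviation, neither of which is stated explicitly in the text.

```python
import math
from statistics import pstdev


def expected_cases(observed, population):
    """Expected count per neighbourhood: pooled new-case rate times local population."""
    # Assumption: the mean rate is pooled over all areas (total cases / total population).
    pooled_rate = sum(observed) / sum(population)
    return [pooled_rate * pop for pop in population]


def crude_sir(observed, expected):
    """Crude SIR: observed new cases divided by expected new cases."""
    return [y / e for y, e in zip(observed, expected)]


def standardise(values):
    """Scale one explanatory variable by its standard deviation: x_ki = X_ki / SD_k."""
    # Assumption: population SD; the text does not say which estimator was used.
    sd = pstdev(values)
    return [v / sd for v in values]


def log_mu(expected_i, beta0, betas, x_i, theta_i, nu_i):
    """log(mu_i) = log(eps_i) + beta0 + sum_k beta_k * x_ki + theta_i + nu_i."""
    log_lambda = beta0 + sum(b * x for b, x in zip(betas, x_i)) + theta_i + nu_i
    return math.log(expected_i) + log_lambda


# Toy data for three hypothetical neighbourhoods (not study data).
observed = [12, 30, 8]
population = [4000, 6000, 2000]
expected = expected_cases(observed, population)
sir = crude_sir(observed, expected)
print([round(s, 2) for s in sir])
```

With a pooled rate, the expected counts sum to the observed total, so the crude SIRs are centred around 1; in the model itself the SIR is not computed directly but estimated through the log-linear predictor in `log_mu`.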
For the $\beta_k$ we assumed non-informative Gaussian prior distributions with a mean of zero and a precision of $10^{-5}$, whereas $\beta_0$ was assumed to have a flat distribution. In this context, $\nu_i$ is a non-spatially structured random effect, assumed to have an independent Gaussian distribution with zero mean and variance $\sigma_\nu^2$, the variance following an inverse Gamma distribution specified as $1/\sigma_\nu^2 \sim \text{dgamma}(0.5, 5 \times 10^{-4})$. This effect was included in the models to account for extra-Poisson variation due to important explanatory variables that were not measured. The spatially structured random effects $\theta = (\theta_1, \ldots, \theta_n)$ accounted for spatial dependence, with the prior distribution taken as a conditional intrinsic Gaussian autoregressive model, in which the mean of $\theta_i$ is a weighted average of the neighbouring random effects and the variance $\sigma_\theta^2$, following an inverse Gamma distribution of the form $1/\sigma_\theta^2 \sim \text{dgamma}(0.5, 5 \times 10^{-4})$, controls the strength of this local spatial dependence:

$$p(\theta_i \mid \theta_{j \neq i}) \sim N\left(\frac{\sum_{j \neq i} w_{ij}\,\theta_j}{\sum_{j \neq i} w_{ij}},\; \frac{\sigma_\theta^2}{\sum_{j \neq i} w_{ij}}\right).$$

As in most area-based studies, we defined neighbours as adjacent census tracts with simple binary adjacency weights, i.e. $w_{ij} = 1$ if areas i and j share a common boundary and $w_{ij} = 0$ otherwise.
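As an illustration of the binary adjacency weights and the conditional prior on the spatial effects, the following Python sketch builds a weight matrix from a list of shared borders and returns the conditional mean and variance of one $\theta_i$ given the others. The function names, the four-area layout and the numerical values are hypothetical, chosen only to make the arithmetic easy to follow; this is not the WinBUGS code used in the study.

```python
def binary_weights(n_areas, shared_borders):
    """Symmetric 0/1 matrix: w[i][j] = 1 if areas i and j share a boundary."""
    w = [[0] * n_areas for _ in range(n_areas)]
    for i, j in shared_borders:
        w[i][j] = 1
        w[j][i] = 1
    return w


def car_conditional(i, theta, w, sigma2_theta):
    """Conditional mean and variance of theta[i] given all other theta[j]."""
    n_neighbours = sum(w[i][j] for j in range(len(theta)) if j != i)
    if n_neighbours == 0:
        raise ValueError("area has no neighbours; the conditional prior is undefined")
    # Mean: weighted average of the neighbouring random effects.
    mean = sum(w[i][j] * theta[j] for j in range(len(theta)) if j != i) / n_neighbours
    # Variance: shrinks as the number of neighbours grows.
    variance = sigma2_theta / n_neighbours
    return mean, variance


# Four hypothetical areas arranged in a row: 0-1-2-3.
w = binary_weights(4, [(0, 1), (1, 2), (2, 3)])
theta = [0.2, -0.1, 0.4, 0.0]
mean_1, var_1 = car_conditional(1, theta, w, sigma2_theta=0.5)
print(mean_1, var_1)
```

Area 1 borders areas 0 and 2, so its conditional mean is the average of their effects and its conditional variance is $\sigma_\theta^2$ divided by two.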
The prior probability distributions and the likelihood of the data were combined using Bayes' theorem to obtain posterior distributions for the SIR [19]. The parameters were estimated by Markov chain Monte Carlo methods, using the public domain software package WinBUGS (Cambridge, UK) [20]. Two Markov chains were run in parallel, with different initial values, for parameter estimation. The time series plot for each parameter and the Gelman-Rubin statistics showed that convergence occurred within 6,000 iterations. Inference on the parameters was therefore based on 20,000 iterations of both chains after a burn-in phase of 10,000 iterations. Each neighbourhood SIR was then input into a Geographic Information System for mapping. To investigate whether the posterior neighbourhood incidence rates were significantly higher or lower than the average rate, we defined low risk (LR) and high risk (HR) neighbourhoods as follows. An HR neighbourhood was considered to have a rate significantly greater than the mean when the SIR was greater than 1 in more than 95% of iterations from the posterior distribution. Likewise, an LR neighbourhood was considered to have a rate significantly lower than the mean when the SIR was lower than 1 in more than 95% of iterations from the posterior distribution. In all other cases, the neighbourhood rate was considered to be not significantly different from the mean rate.
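One reading of the HR/LR rule above is that a neighbourhood is flagged when more than 95% of the posterior draws of its SIR fall on one side of 1. The Python sketch below implements that reading under this assumption; the function name `classify_risk`, the label "NS" for non-significant neighbourhoods and the toy draws are hypothetical and do not come from the study.

```python
def classify_risk(sir_draws, threshold=0.95):
    """Label a neighbourhood from posterior SIR draws.

    "HR" if the share of draws above 1 exceeds the threshold,
    "LR" if the share of draws below 1 exceeds the threshold,
    "NS" (not significantly different from the mean) otherwise.
    """
    n_draws = len(sir_draws)
    if n_draws == 0:
        raise ValueError("at least one posterior draw is required")
    share_above = sum(1 for s in sir_draws if s > 1.0) / n_draws
    share_below = sum(1 for s in sir_draws if s < 1.0) / n_draws
    if share_above > threshold:
        return "HR"
    if share_below > threshold:
        return "LR"
    return "NS"


# Toy posterior samples for three hypothetical neighbourhoods.
high = [1.2] * 97 + [0.9] * 3     # 97% of draws above 1
low = [0.7] * 99 + [1.1] * 1      # 99% of draws below 1
unclear = [1.1] * 60 + [0.9] * 40 # neither side reaches 95%

for draws in (high, low, unclear):
    print(classify_risk(draws))
```

In practice the draws for each neighbourhood would be the post-burn-in iterations pooled from both chains.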