Background
Tuberculosis (TB), caused by
Mycobacterium tuberculosis (MTB), remains a significant public health problem worldwide. It is estimated that approximately one-third of the world’s population has been infected with MTB, and 1.8 million people die of this disease annually. Among the 22 high TB burden countries reported by the World Health Organization, China ranks second in the world with approximately 1.3 million new cases [
1,
2].
Recently, molecular epidemiology tools have been used to assess risk factors associated with recent transmissions [
3], to track infection transmission dynamics, to distinguish relapse from reinfection and to detect suspected outbreaks; therefore, these tools play a critical role in tuberculosis research and control. MTB molecular markers have promoted the development of reproducible genotyping methods [
4], including insertion sequence 6110 (IS6110) restriction fragment length polymorphism (RFLP) typing [
5], spacer oligonucleotide typing (spoligotyping) [
6], single nucleotide polymorphism analysis [
7], mycobacterial interspersed repetitive unit variable number tandem repeat (MIRU-VNTR) typing [
8], large sequence polymorphism (LSP) typing [
9], and genome sequence analysis [
10]. IS6110-RFLP has been the gold standard for genotyping MTB since 1993, but this procedure is time consuming, technically demanding and labour intensive. It also requires about one microgram of high-quality DNA. Moreover, its discriminatory power is insufficient for strains harbouring low copy numbers of IS6110. Beijing genotype strains exhibit highly similar RFLP patterns, and discrimination among them is therefore difficult. By contrast, rapid and inexpensive PCR-based genotyping methods, such as MIRU-VNTR and spoligotyping, have been used effectively to investigate the genetic relationships and epidemiological characteristics of Beijing strains.
A non-coding region of the MTB genome, the direct repeat (DR) locus, contains a set of identical 36-bp direct repeats (DRs), which are separated by 35- to 41-bp unique DNA spacer sequences. Spoligotyping, a rapid and highly reproducible method, detects the presence or absence of these spacers in the DR locus [
11]. The results can be represented in a simple binary format that enables the construction of large-scale databases [
12]. Spoligotyping is also considered the gold standard for identifying Beijing family strains, which lack spacers 1–34 and harbour spacers 35–43 in the DR region [
11,
13]. Unfortunately, spoligotyping has limited discriminatory power, especially in regions with a high prevalence of Beijing isolates [
14]. The discriminatory power is improved when spoligotyping is combined with VNTR typing.
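As an illustration of the binary format mentioned above, the following Python sketch converts a 43-spacer binary pattern into the 15-digit octal code commonly used in spoligotype databases and flags Beijing-like patterns. The function names are hypothetical, and the rule applied here (spacers 1–34 absent, at least three of spacers 35–43 present) is the commonly cited definition, taken as an assumption rather than as the exact procedure used in this study.

```python
def to_octal(pattern: str) -> str:
    """Convert a 43-character binary spoligotype ('1' = spacer present)
    into a 15-digit octal code."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("expected 43 characters, each '0' or '1'")
    # Spacers 1-42 are read as 14 triplets; spacer 43 is appended as one digit.
    digits = [str(int(pattern[i:i + 3], 2)) for i in range(0, 42, 3)]
    return "".join(digits) + pattern[42]


def is_beijing_like(pattern: str, min_present: int = 3) -> bool:
    """Assumed rule: spacers 1-34 absent and at least `min_present`
    of spacers 35-43 present."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("expected 43 characters, each '0' or '1'")
    return "1" not in pattern[:34] and pattern[34:].count("1") >= min_present


# Typical Beijing pattern: 34 absent spacers followed by 9 present spacers.
typical_beijing = "0" * 34 + "1" * 9
print(to_octal(typical_beijing))         # 000000000003771
print(is_beijing_like(typical_beijing))  # True
```

Because each pattern reduces to a short string, profiles from different laboratories can be compared by simple equality, which is what makes large shared databases practical.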
MIRU-VNTR typing, a newer PCR-based method, amplifies mycobacterial interspersed repetitive unit loci and infers the number of repeat units at each locus from the size of the amplicon. Ease of operation, low cost, reproducible results and high discriminatory power make it practical for routine use [
15], and the digital results from this method can be compared and exchanged easily between different laboratories [
16,
17]. Twelve-locus MIRU-VNTR has been widely used but discriminates poorly within the Beijing family [
18]. The 15- and 24-locus sets, however, improve discrimination over the initial 12-locus set [
19].
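The size-to-repeat-number step described above can be sketched in Python as follows. This is a minimal illustration, not the laboratory protocol: the locus names, flanking lengths and repeat-unit lengths in `LOCI` are hypothetical placeholders, and real loci require the published allele-calling tables.

```python
def repeat_count(amplicon_bp: float, flank_bp: int, unit_bp: int) -> int:
    """Estimate the number of tandem repeats from a measured amplicon size.

    Gel or capillary sizing is approximate, so the result is rounded to the
    nearest whole repeat.
    """
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    copies = round((amplicon_bp - flank_bp) / unit_bp)
    if copies < 0:
        raise ValueError("amplicon is shorter than the flanking sequence")
    return copies


# Hypothetical loci: (flanking length in bp, repeat unit length in bp).
LOCI = {"locus_A": (150, 53), "locus_B": (200, 77)}


def digital_profile(sizes: dict) -> tuple:
    """Turn measured amplicon sizes into a tuple of repeat counts, one per locus."""
    return tuple(repeat_count(sizes[name], *LOCI[name]) for name in LOCI)


measured = {"locus_A": 310, "locus_B": 431}
print(digital_profile(measured))  # (3, 3)
```

The resulting tuple is the kind of digital result that can be exchanged and compared between laboratories.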
In China, more than 80% of tuberculosis patients are in rural communities [
20]. Henan Province, the most highly populated province in China, has a high proportion of its population living in rural areas, and the tuberculosis epidemic there therefore remains severe. The numbers of both TB and drug-resistant TB patients in Henan are larger than those in any other province, and TB-HIV co-infection makes the situation worse. Thus, studying the
M. tuberculosis transmission model can help determine risk factors and improve contact tracing. Moreover, little has been known about the genetic diversity of MTB in this region until now. This study is the first to use 26-locus MIRU-VNTR, comprising the standard 24 loci and two additional loci (ETRF and Mtub38) [
21,
22], for assessments in Henan. In this study, we carried out spoligotyping and MIRU-VNTR typing to classify 668 representative strains from 17 cities. The objective was to assess the diversity of MTB circulating in Henan with higher discrimination and to analyse the probable association between drug resistance profiles and genotypes.
Discussion
The Beijing family genotype remains the predominant genotype in China [
14]; however, the proportion of patients carrying this genotype varies in different regions [
14]. Henan Province has a high incidence of tuberculosis, but little is known about the genetic background of
M. tuberculosis in this region. This is the first study to investigate the allelic diversity of MTB isolates in Henan Province using spoligotyping and MIRU-VNTR.
M. tuberculosis is divided into 162 clades according to the international spoligotyping database SpolDB4 [
26], and the Beijing family is regarded as the most important genetic pattern of the East Asian clade [
22]. The pattern is predominant in China, and its proportion is greater in northern China than in southern China. The prevalence rate of the Beijing family is higher in Beijing (82–92.6%) [
31,
33], Tianjin (91.7%) [
34], Tibet (96.3%) [
33,
35], Inner Mongolia (93.3%) [
36], Heilongjiang (89.5%) [
37], and Gansu (87.5%) [
38] but lower in Guangdong (25%) [
39], Guangxi (55.3%) [
35] and Fujian (54.5–55.1%) [
33,
40]. Our results showed that 83.5% of the MTB isolates belonged to the Beijing family, indicating that this was the predominant genotype in our region, consistent with previous studies. The higher prevalence of the Beijing family might be associated with customs influenced by climate [
14]. The non-Beijing lineages included the T1, T2, T3, MANU1, MANU2, S, LAM3, S/convergent, LAM3/S and U genotypes, indicating genotypic polymorphism among MTB strains in this area. Among the non-Beijing family isolates, 59 were in the T1 family (53.6%), 20 were in the MANU2 family (18.2%), 10 were in the T3 family (9.1%), and 7 were in the T2 family (6.4%); these genotypes have also been observed in other regions of China, albeit at different proportions [
35‐
40]. This study also identified eight isolates with new spoligotypes. They were divided into different clusters and were derived from patients in different regions, indicating that they might not be epidemiologically related. The large number of small genotype clusters could reflect recent transmission. The higher rate of small genotypic clusters for the Beijing family suggested its predominance in recent transmission.
The prevalence of the Beijing genotype is apparently high worldwide, but very little is known about the reasons for its efficient transmission. Previous studies have suggested that this genotype is associated with drug resistance and shows increased virulence in animal models [
41] and enhanced reproductive fitness [
42]. Overproduction of polyketide synthase-derived phenolic glycolipid (PGL) by the Beijing family inhibits the release of pro-inflammatory cytokines, thus enhancing the infective success [
43,
44]. The Beijing family has a strong association with drug resistance, indicating that this family might be predisposed to acquiring resistance [
45], thereby increasing the transmission of drug-resistant
M. tuberculosis. However, findings for the non-Beijing family have been inconsistent because this family includes various subtypes. Reports on the relationship between the Beijing lineage and TB outbreaks also differ across geographic locations [
23,
45‐
52]. To analyse the relationship between the Beijing family and drug resistance, we compared the proportion of Beijing genotypes across drug susceptibility profiles; resistance to all four first-line drugs was significantly higher in the Beijing family. Beijing family isolates also had higher rates of INH resistance, RIF resistance and MDR in Ukraine [
53]. Similarly, the Beijing family also had a close association with INH, RIF, SM and MDR resistance in Central Asia [
54]. In agreement with previously reported data, our data revealed that the Beijing genotype showed a greater correlation with MDR-TB phenotypes than did non-Beijing genotypes. The long-term reciprocal co-evolution between host and bacterium might affect the prevalence of the Beijing genotype [
55,
56]; thus, we assessed the correlation between the Beijing family and epidemiological features, including sex, age and treatment status. Previous studies have reported that Beijing genotype strains are generally associated with young age [
57] and higher rates of treatment failure and relapse than other strains [
58,
59], but in this study, there was no association between the prevalence of Beijing genotypes and the gender, age or clinical treatment history of patients. Therefore, it is necessary to further explore the effect of demographic factors on the genetic diversity of
M. tuberculosis with a larger sample size in our area.
Spoligotyping is an efficient genotyping technique that can classify the MTB lineage, but it cannot effectively distinguish Beijing family isolates due to lower discriminatory power [
35]. In this study, 668 samples were successfully classified into 35 distinct genotypes, including 10 clusters and 25 unique spoligotypes. Due to the low resolution of spoligotyping, we applied another typing method based on MIRU-VNTR to further phylogenetically analyse the molecular characteristics of these isolates [
35,
37]. It is important to choose the appropriate VNTR loci to identify the most prevalent cluster for the Beijing family. The classical 12-locus MIRU-VNTR set is a widely used molecular epidemiological approach to elucidate the phylogenetic diversity of MTB isolates, but it is not effective at distinguishing Beijing isolates. The 15-locus and 24-locus VNTR combinations have sufficient discriminatory power and are suitable for MTB genotyping, especially in areas where the Beijing family is prevalent. However, it is not necessary to utilize all 24 loci for genotyping MTB isolates due to the diversity of the isolate population structure. In addition, the 24-locus set is very time consuming and complicated to operate. The genotyping efficiency varied depending on the disparate loci sets in different surveyed areas. To identify a suitable locus set for classifying MTB in Henan, we first chose the 26-locus set to assess the 668 clinical isolates according to previous studies [
21,
22,
60]. In total, 567 genotypes, comprising 38 clusters and 529 unique genotypes, were obtained by 26-locus VNTR analysis, with an HGDI of 0.9984. For the Beijing family isolates, the clustering rate of the 26-locus set (16.12%) was markedly lower than that of spoligotyping (98.74%), and the cumulative HGDI of the 26-locus set (0.998) was significantly higher than that of spoligotyping (0.042). Moreover, the VNTR clustering rate differed between the Beijing and non-Beijing families (16.13% vs. 2.7%), suggesting that the Beijing family may be transmitted more efficiently in Henan. Previous studies showed that combining MIRU-VNTR and spoligotyping could enhance the discriminatory power of MTB genotyping [
19,
61]. Correspondingly, our data showed that the combination of spoligotyping and the 26-locus VNTR set classified all 668 isolates into 576 different patterns and had a lower clustering rate (13.77%) than 26-locus VNTR alone (Table
4).
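The two summary statistics used in this paragraph can be sketched in Python as follows. The HGDI is the standard Hunter-Gaston formula; the clustering-rate definition, (clustered isolates − clusters) / total isolates, is an assumption, since this section does not state the formula, although it does reproduce the 13.77% reported for 668 isolates and 576 patterns. The function names and toy data are illustrative only.

```python
from collections import Counter


def hgdi(genotypes):
    """Hunter-Gaston discriminatory index:
    1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), with n_j the size of genotype j."""
    n_total = len(genotypes)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    sizes = Counter(genotypes).values()
    return 1.0 - sum(n * (n - 1) for n in sizes) / (n_total * (n_total - 1))


def clustering_rate(genotypes):
    """Assumed definition: (clustered isolates - number of clusters) / total isolates."""
    sizes = Counter(genotypes).values()
    clustered = sum(n for n in sizes if n > 1)
    clusters = sum(1 for n in sizes if n > 1)
    return (clustered - clusters) / len(genotypes)


# Toy data (not from the study): six isolates carrying three genotypes.
toy = ["A", "A", "B", "C", "C", "C"]
print(round(hgdi(toy), 4))   # 0.7333
print(clustering_rate(toy))  # 0.5

# The same rate equals (isolates - distinct genotypes) / isolates.
print(round((668 - 576) / 668 * 100, 2))  # 13.77
```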
Our analysis showed that 11 loci of the 26-locus MIRU-VNTR set were poorly discriminatory. In particular, the VNTR49 and MIRU02 loci did not improve the cumulative HGDI, indicating that these two loci were conserved and had no power to discriminate among MTB isolates. In this study, the two largest clusters contained 15 and 14 strains, and the other clusters were composed of 2 to 9 strains. Among the 29 isolates of the two largest clusters, 24 (82.7%) belonged to the Beijing family, three to the T1 family, one to MANU2, and one to the H3 family. Moreover, 10 (34.5%) of these 29 isolates were resistant to one or more drugs. However, it remains uncertain whether the patients infected with these strains had close contact with each other. Because the discriminatory power of the different locus sets varied in Henan, we sought an optimal set with discriminatory power comparable to that of the 26-locus set. In consideration of labour and economic costs, we chose the combination of the top 10 loci, whose HGDI was comparable to that of the classical 24-locus set (0.996 vs. 0.997) and only slightly lower than that of the 26-locus set (0.998). These data indicated that the 10-locus combination was useful and cost-efficient. Therefore, we suggest this 10-locus set as a potential first-line MTB genotyping method in Henan Province, especially for large-scale molecular epidemiological surveys.
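One way to examine how much each locus adds is to rank loci by their single-locus HGDI and then track the HGDI of the growing locus set. The Python sketch below does this on toy data. It assumes this ranking scheme, which this section does not spell out, so it should be read as an illustration and not as the authors' exact selection procedure; the repeat counts are invented.

```python
from collections import Counter


def hgdi(values):
    """Hunter-Gaston discriminatory index for a list of hashable genotypes."""
    n_total = len(values)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    sizes = Counter(values).values()
    return 1.0 - sum(n * (n - 1) for n in sizes) / (n_total * (n_total - 1))


def cumulative_hgdi(profiles, loci):
    """Rank loci by single-locus HGDI (highest first), then report the HGDI
    obtained as each locus is added to the combined profile."""
    ranked = sorted(loci, key=lambda locus: hgdi([p[locus] for p in profiles]),
                    reverse=True)
    steps = []
    for k in range(1, len(ranked) + 1):
        combined = [tuple(p[locus] for locus in ranked[:k]) for p in profiles]
        steps.append((ranked[k - 1], round(hgdi(combined), 4)))
    return steps


# Toy repeat counts (not study data) for four isolates at three loci.
toy_profiles = [
    {"QUB11b": 1, "MIRU26": 3, "MIRU02": 2},
    {"QUB11b": 2, "MIRU26": 3, "MIRU02": 2},
    {"QUB11b": 3, "MIRU26": 3, "MIRU02": 2},
    {"QUB11b": 3, "MIRU26": 5, "MIRU02": 2},
]
for locus, value in cumulative_hgdi(toy_profiles, ["MIRU02", "MIRU26", "QUB11b"]):
    print(locus, value)
```

In this toy set the conserved locus is ranked last and leaves the cumulative HGDI unchanged, which mirrors the behaviour reported above for VNTR49 and MIRU02.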
Our data revealed some inconsistency between the results of spoligotyping and MIRU-VNTR for several MTB isolates. Beijing family isolates comprised a large proportion of the largest cluster by MIRU-VNTR genotyping; however, 33 isolates with a non-Beijing spoligotype were found in this largest clade. Furthermore, another 19 strains showed the Beijing spoligotype pattern but could not be distinguished by MIRU-VNTR (Fig.
1). This divergence might be due to mixed infection with two different MTB strains in these samples [
62].
Until now, no genotyping method based on genetic markers has been able to classify the Beijing family with complete accuracy because exceptional strains always occur [
46]. Previous studies showed that different VNTR loci had varying discriminatory power for the Beijing and non-Beijing family genotypes [
21,
63]. In our study, only QUB11b had a higher discriminatory power for the Beijing family among the 26 loci. However, six loci (QUB11b, Mtub21, MIRU26, QUB26, Mtub04 and MIRU10) exhibited a higher discriminatory power for the non-Beijing family. Four loci (Mtub38, ETR-B, ETR-D and MIRU40) showed remarkable differences in allelic diversity between Beijing and non-Beijing genotypes, with a difference in HGDI greater than 0.25.
There are several limitations of this study. First, the small sample size is a major limitation; to ensure greater reliability and representativeness of the findings, the sample should be enlarged in future work. Second, initial sputum isolates were not collected when the retreated tuberculosis patients were first diagnosed; thus, we were unable to differentiate relapse from reinfection.