Background
Methods
Search strategy and study selection
Data extraction
Data synthesis and quality assessment
Protocol and registration
Results
Confirmation of transmission
Journal article | How was threshold defined? | Cut-off | Sampling fraction | Lineages |
---|---|---|---|---|
Bryant et al. [30] | Own data | ≤6 SNPs relapse (same strain); >1,306 re-infection (different) | 47 sequenced out of 50 chosen | Four major lineages |
Clark et al. [23] | Unknown | <50 SNPs defined a cluster | | CAS, LAM, EAI, T1, T2, Beijing, X1 |
Guerra-Assunção et al. [29] | Own data | ≤10 SNPs relapse; >100 re-infection | 60 out of 139 WGS confirmed recurrences | Four major lineages |
Guerra-Assunção et al. [18] | Own data (transmission); Guerra-Assunção et al. [29] (relapse) | ≤10 SNPs confirmed transmission; ≤10 SNPs defined a relapse | 1,687 out of 2,332 had WGS | Four major lineages |
Kato-Maeda et al. [26] | Own data | 0–2 SNPs per transmission event | ||
Lee et al. [17] | Own data | 0–1 SNPs confirmed transmission | 631 ‘improbable’ transmission pairs—between outbreak cases and cases in other villages | Outbreak isolates were Euro-American lineage |
Luo et al. [16] | Walker et al. [21] | |||
Roetzer et al. [14] | Own data | 3 SNPs confirmed transmission | 31 out of 2,301 (for the threshold). Equivalent to eight transmission chains of 2–7 patients | Haarlem lineage |
Walker et al. [21] | Own data | ≤5 SNPs cluster; >12 SNPs no transmission | 303 out of 609 (for the threshold) | All five major lineages |
Walker et al. [22] | Own data | Distances of 475, 1,032 and 1,096 SNPs suggested secondary infection with a different strain rather than within-host evolution | Pulmonary vs extrapulmonary pairs from 49 patients and 110 longitudinal isolates from 30 patients | All five major lineages |
Witney et al. [19] | Walker et al. [21] | |||
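
The cut-offs in the table above all follow the same two-threshold pattern: a pairwise SNP distance at or below a lower bound supports one interpretation, a distance above an upper bound supports the other, and anything in between is left unresolved. The sketch below illustrates that pattern in Python. The function names, the `"indeterminate"` label and the treatment of missing bases are assumptions made for illustration and are not taken from any of the reviewed studies; the default values are the Guerra-Assunção et al. [29] cut-offs (≤10 SNPs relapse; >100 SNPs re-infection).

```python
def snp_distance(seq_a, seq_b, missing="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a missing call are skipped
    (an assumption; studies differ in how they handle missing data).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in missing and b not in missing
    )


def classify_recurrence(distance, relapse_max=10, reinfection_min=100):
    """Classify a recurrence from the SNP distance between paired isolates.

    Defaults follow the cut-offs reported for Guerra-Assunção et al. [29].
    Distances between the two cut-offs are returned as "indeterminate"
    (a hypothetical label, not one used in the source).
    """
    if distance < 0:
        raise ValueError("SNP distance cannot be negative")
    if distance <= relapse_max:
        return "relapse"
    if distance > reinfection_min:
        return "re-infection"
    return "indeterminate"
```

Passing other values for `relapse_max` and `reinfection_min` reproduces the other rows, for example 6 and 1,306 for Bryant et al. [30]. Because the thresholds are study-specific, they are kept as parameters rather than hard-coded.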
Direction
Journal article | How was direction of transmission determined? |
---|---|
Didelot et al. [25] | Epidemiological data and WGS used in a Bayesian inference framework to construct a transmission tree |
Gardy et al. [12] | Social network analysis and contact tracing suggested putative transmission events; timing of infection and smear status were used to narrow down the possible direction, and WGS was used to rule out transmission events involving cases with different lineages |
Kato-Maeda et al. [26] | Contact tracing and accumulation of SNPs |
Luo et al. [16] | Epidemiological links and timing of infection and symptoms helped propose direction of transmission between isolates in the same WGS-based cluster. Transmission of mutant alleles from a case with mixed base calls |
Mehaffy et al. [13] | Genomic and epidemiological information (i.e. SNP pattern, contact information, year of diagnosis and infectiousness based on smear and chest X-ray results) |
Pérez-Lago et al. [31] | In one case direction was proposed by the transmission of mutant alleles from a case with mixed base calls |
Roetzer et al. [14] | Contact tracing revealed transmission chains, and accumulation of variation is mentioned, although it is not clear whether this resolved the order of the chain |
Schürch et al. [24] | Accumulation of SNPs |
Smit et al. [27] | Accumulation of SNPs and period of infectiousness |
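
Several of the studies above used accumulation of SNPs to propose direction. One simple way to express that idea, shown here as a sketch only, is to compare the sets of SNPs that two isolates carry relative to a common reference. If one isolate carries every SNP of the other plus additional ones, the pair is consistent with transmission from the isolate with fewer SNPs to the one with more. The function name and return labels are hypothetical, and the sketch ignores sampling dates, within-host diversity and unsampled intermediate cases, all of which the reviewed studies combined with epidemiological data.

```python
def propose_direction(snps_a, snps_b):
    """Propose a direction of transmission from nested SNP sets.

    snps_a and snps_b are iterables of SNP identifiers (for example,
    genome positions) called against the same reference.
    Returns "A->B", "B->A" or "unresolved".
    """
    a, b = frozenset(snps_a), frozenset(snps_b)
    if a < b:  # A's SNPs are a proper subset of B's: B accumulated more
        return "A->B"
    if b < a:  # B's SNPs are a proper subset of A's
        return "B->A"
    # Identical sets, or each isolate has private SNPs: SNP accumulation
    # alone cannot order the pair.
    return "unresolved"
```

An `"unresolved"` result is expected to be common, given that many studies in this review found identical isolates within clusters.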
Recurrences
Within-host diversity
Journal article | Mixed infections or microevolution | Definition of heterozygous base call |
---|---|---|
Bryant et al. [30] | Mixed infection | Mixed base positions were identified at sites where more than one base had been identified in a single sample, where each allele was supported by at least 5 % of reads (minimum read depth of four). Only positions that showed no strand bias (p > 0.05) and had coverage within the normal range, a mapping quality score greater than 50 and base quality scores greater than 30 were included. Sites within 200 base pairs of other heterozygous sites were discounted because of the possibility that they might have been caused by a mapping error. More than 80 heterozygous base calls defined a mixed infection |
Guerra-Assunção et al. [18] | Mixed infection | Sample genotypes were called using the majority allele (minimum frequency 75 %) in positions supported by at least 20-fold coverage; otherwise they were classified as missing (thus ignoring heterozygous calls). We excluded samples with >15 % missing genotype calls, to remove possible contaminated or mixed samples or technical errors |
Guerra-Assunção et al. [29] | Mixed infection | A position was classified as heterozygous if >1 allele accounted for ≥30 % of the reads (and there were >30 reads). More than 140 heterozygous positions in one sample classified as mixed infection |
Kato-Maeda et al. [26] | Mixed infection | Mixed infection was identified when there was a heterozygous base call: 38 % of reads supported the variant; the rest supported reference |
Luo et al. [16] | Microevolution | Kept only the calls in which the coverage was ten and the less frequent allele was supported by at least five high-quality reads, as reliable calls. Presence of mixed base calls could indicate microevolution in that patient |
Pérez-Lago et al. [31] | Mixed infection | Less frequent nucleotide was supported by five reads |
Walker et al. [21] | Microevolution | Suggestive of ‘sub-populations’; i.e. microevolution |
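
The definitions in the table above share a common structure: a per-site rule based on read depth and allele fraction, followed by a per-sample rule based on the number of heterozygous sites. The sketch below illustrates this using the Guerra-Assunção et al. [29] values (more than one allele at ≥30 % of reads, more than 30 reads, and more than 140 heterozygous positions for a mixed infection). The function names and the input layout (a dictionary of read counts per allele at each site) are assumptions for illustration; none of the additional filters used by other studies, such as strand bias or mapping quality, are modelled.

```python
def is_heterozygous(allele_counts, min_fraction=0.30, min_reads=30):
    """Return True if more than one allele reaches min_fraction of reads.

    allele_counts maps each observed base to its read count at one site.
    The site must be covered by more than min_reads reads.
    """
    depth = sum(allele_counts.values())
    if depth <= min_reads:
        return False
    supported = [
        base for base, n in allele_counts.items() if n / depth >= min_fraction
    ]
    return len(supported) > 1


def is_mixed_infection(sites, max_het_sites=140):
    """Return True if a sample has more than max_het_sites heterozygous sites.

    sites is an iterable of per-site allele count dictionaries.
    """
    n_het = sum(1 for counts in sites if is_heterozygous(counts))
    return n_het > max_het_sites
```

Changing `min_fraction`, `min_reads` and `max_het_sites` approximates some of the other rows (for example, more than 80 heterozygous calls for Bryant et al. [30]), which makes it easy to see how sensitive a mixed-infection call is to these choices.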
Drug resistance
Quality of studies
Discussion
Main findings: implications of analytical approaches on WGS inferences
Strengths and limitations
Comparison with recent reviews
Conclusions
Over-arching findings from included papers | Recommendations |
---|---|
Suggested SNP thresholds for evidence of transmission are heterogeneous and sensitive to the finding of epidemiological links, SNP calling protocols and culturing/sampling, and so may not be transferable between settings and/or studies | When setting study-specific SNP thresholds, consider the time between samples, the mutation rate, the evolutionary pressure the strain may have been subjected to, and the endemicity of strains. Consider alternative approaches for determining transmission, including Bayesian approaches |
The distinction between relapse and re-infection for repeated instances of TB disease has been made empirically (by examining the distribution of SNP distances between the initially infecting and subsequently infecting strains) | While existing thresholds appear adequate for clinical trials, epidemiological and clinical data should also be considered, and a better estimate of the within-host mutation rate is needed when more accurate classification is required |
The lack of diversity within M.tb complicates the use of WGS for inferring transmission patterns (17/25 studies found identical samples). Recent case studies show that there may be more diversity that is not identified by commonly used WGS methods | Deep sequencing, multiple samples and looking at shared minor variants (mutations present at low frequencies) will enhance detection of diversity. Epidemiological data, and consideration of associated uncertainty due to missing contact information, will also be necessary |
Examining resistance-conferring mutations shared by phylogenetic clusters is a common method for identifying transmission of drug-resistant strains. However, phylogenetic clusters do not necessarily correspond to transmission clusters | Reconstruction of the transmission tree followed by an examination of the drug resistance patterns between linked individuals may be more appropriate |