Background
Salmonella is a bacterial pathogen that causes foodborne illnesses worldwide. It is estimated that more than 1.2 million cases of salmonellosis occur in the United States annually, resulting in 23,000 hospitalizations and 450 deaths [
1]. Among the 2500 serotypes of
Salmonella spp.,
S. Tennessee is rarely isolated and is responsible for <0.1% of
Salmonella infections [
2]. However, in 2006–2007, a large and nationwide outbreak of
S. Tennessee infections occurred in the United States, and the consumption of contaminated peanut butter was found to be strongly associated with this outbreak [
3,
4]. The outbreak lasted for over a year, leading to approximately 715 reported cases in 48 states [
5]. While most infected people had gastrointestinal symptoms, including diarrhea, fever, and abdominal pain, more than a third of them had a urinary tract infection [
4,
5]. Urinary tract infection caused by
Salmonella species is rare, and some researchers presumed that this may be related to the prolonged survival of
S. Tennessee in the environment, highlighting the necessity of molecular subtyping to detect outbreak-related strains from the environmental background [
5,
6]. Several studies have revealed the high virulence and survivability of
S. Tennessee strains [
7‐
10]. In addition, because peanut butter has a long shelf life, contamination might result in
S. Tennessee infections over the long term.
S. Tennessee was identified from unopened peanut butter during another peanut butter outbreak caused by
S. Typhimurium in 2009, indicating that sporadic cases of
S. Tennessee infection may have occurred upon the consumption of contaminated peanut butter by individuals who did not know of the peanut butter outbreak [
11].
Several molecular-based techniques are used to differentiate and identify the relatedness of
Salmonella species. Pulsed-field gel electrophoresis (PFGE), a well-known molecular typing method, has been used as the “gold standard” for subtyping
Salmonella spp. The peanut butter outbreak-associated
S. Tennessee strains have the unique CDC PulseNet PFGE profiles of
XbaI patterns JNXX01.0010, JNXX01.0011, and JNXX01.0026, which were used to determine their association with this outbreak [
5]. However, PFGE is a labor-intensive technique requiring more than 2 days to perform. In addition, the PFGE technique does not always optimally discriminate the bacterial strains, especially closely related strains [
12]. To overcome these disadvantages, several molecular subtyping methods, including multi-locus variable-number tandem repeat analysis (MLVA) and multi-locus sequence typing (MLST), were adapted for differentiating
Salmonella serovars [
13,
14]. Despite the many advantages of these techniques, MLVA was found to be less effective for long-term epidemiological studies owing to the instability of some loci that evolve quickly [
15,
16]; furthermore, the usefulness of MLST for the investigation of outbreaks is controversial owing to the limited number of mutations within the housekeeping genes used for the MLST study [
17,
18]. As an alternative technique, a single-nucleotide polymorphism (SNP) method was introduced. SNPs located throughout the bacterial genome, particularly at multiple loci selected from highly polymorphic genes such as those associated with quinolone resistance or flagellar antigens, can be used to assess genetic relatedness within a bacterial population and to trace the evolutionary origin of a bacterial species. Owing to this advantage, the SNP-typing method is often used to investigate the epidemiology of an outbreak and to trace the temporal and geographical origin of particular bacteria through mutational events [
12,
18]. To date, only a few SNP-typing methods have been developed for
Salmonella spp. [
19‐
21]. The development of novel SNP-typing tools would play an important role in identifying unrelated strains of
Salmonella spp. [
12].
In this study, an SNP-typing method was developed for S. Tennessee to determine the clonal subtypes of S. Tennessee that were associated with the peanut butter outbreak. In addition, SNP markers were applied to isolates in order to evaluate the genetic relatedness of S. Tennessee strains isolated from various sources. Finally, the minimum set of SNP markers required to determine clonal S. Tennessee strains more rapidly and cost-effectively was identified.
Methods
Procurement of S. Tennessee strains and epidemiological data
A total of 176
S. Tennessee isolates from humans, animals, food, and the environment were procured from eight institutes located in Minnesota, Michigan, Indiana, Tennessee, New York, Iowa, Pennsylvania, and Calgary (Canada). Of the
S. Tennessee isolates, 131 were obtained from five state Departments of Health in the United States, and epidemiological data, including age, sex, isolation date, and PFGE results, were collected for the human isolates, when available. Forty-five
S. Tennessee strains from diverse animal and environmental sources were procured from three institutions (University of Pennsylvania,
Salmonella Reference Center; University of Calgary,
Salmonella Genetic Stock Center; and the National Veterinary Service Laboratory, Ames). Outbreak-associated
S. Tennessee strains were defined as those with illness onset or isolation dates between August 1, 2006 and July 31, 2007, and with PFGE profiles of JNXX01.0010, JNXX01.0011, or JNXX01.0026 [
5] (Table
1).
Table 1
Information on the strains used in this study
| Human (114)d | Stool (60) | IN (7), MI (17), MN (34), NY (19), TN (37) | Yes (81) | Yes (67) | Yes (64) |
| | Urine (32) | | No (32) | No (7) | Suspected (20) |
| | Wound (2) | | Unknown (1) | Unknown (40) | No (30) |
| | Unknown (20) | | | | |
| Food (17) | Peanut butter (7) | MN (13), TN (2), UC (2) | Yes (13) | Yes (9) | Yes (7) |
| | Dried powdered eggs (6) | | No (2) | Unknown (8) | Suspected (8) |
| | Ground beef (1) | | Unknown (2) | | No (2) |
| | Fish meal (1) | | | | |
| | Unknown (2) | | | | |
| Environment (8) | Feed (1) | MN (2), UP (6) | Yes (2) | No (2) | No (8) |
| | Unknown (7) | | Unknown (6) | Unknown (6) | |
| Animal (37) | Avian (24); chicken, chukar, pheasant, turkey, etc. | NVSL (23), UP (14) | Unknown (37) | Unknown (37) | No (37) |
| | Ruminant (10); alpaca, cattle, deer, goat | | | | |
| | Swine (3) | | | | |
Selection of representative strains from various sources for the identification of SNP markers
To select epidemiologically diverse
S. Tennessee strains from humans, animals, food, and the environment, 60 isolates of
S. Tennessee were selected based on diverse PFGE patterns and unrelated epidemiologic information considering factors such as time of isolation and source. These selected isolates were then further screened by using MLST and VNTR as described below to select representative
S. Tennessee strains. MLST was performed on seven housekeeping genes,
thrA, purE,
sucA,
hisD,
aroC,
hemD, and
dnaN, which were derived from the
Salmonella MLST database (
http://mlst.warwick.ac.uk/mlst/dbs/Senterica). Phylogenetic analysis was performed by pairwise comparison of the nucleotide sequences of these seven MLST genes to construct a neighbor-joining tree. For the VNTR analysis, tandem repeats of locus SE5 were analyzed using previously designed primers [
14].
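The pairwise comparison that feeds a neighbor-joining tree can be sketched as a simple distance calculation. The example below is a minimal illustration, assuming that the seven MLST loci have already been concatenated and aligned to equal length for each isolate; the function names and the toy sequences are hypothetical, and the uncorrected proportion of differing sites (p-distance) is only one of several distance measures that could have been used.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def pairwise_distances(concatenated: dict) -> dict:
    """Distance for every unordered pair of isolates, keyed by (name_i, name_j)."""
    return {
        (i, j): p_distance(concatenated[i], concatenated[j])
        for i, j in combinations(sorted(concatenated), 2)
    }


# Toy, made-up sequences standing in for concatenated MLST loci (assumption).
toy = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAC",
    "isolate_C": "ACGTTCGTAA",
}
distances = pairwise_distances(toy)
```

The resulting distance table is the kind of input a neighbor-joining implementation would take; tree construction itself is not shown here.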
Identification of SNPs
To identify SNPs, the sequences of three representative S. Tennessee strains (MN25, TN32, and MN47) were compared. The three strains, which represented different MLST and VNTR types, were selected from the 60 diverse S. Tennessee strains. The genotypic and epidemiologic features of the three strains were as follows: (i) MN25: outbreak-associated strain, isolated from peanut butter during Feb 2007, PFGE pattern of JNXX01.0011, major MLST type, and allele 14 by VNTR; (ii) TN32: outbreak-associated strain, isolated from patient urine during Mar 2007, PFGE pattern of JNXX01.0026, major MLST type, and allele 13 by VNTR; and (iii) MN47: non-outbreak-associated strain, isolated from patient stool during Jan 2008, PFGE pattern of JNXX01.0049, minor MLST type, and allele 8 by VNTR.
Purified DNA from a strain isolated from peanut butter was submitted to the Genomic Core of the Research Technology Support Facility (RTSF) at Michigan State University for pyrosequencing using the 454 GS-FLX Titanium platform. Genome assembly of the produced data identified 66 gaps (range 0.2–1.8 kb) within 14 scaffolds that covered 4.8 Mb of the genome. Assembled sequences were deposited in the Genome Project database (NCBI accession number: PRJNA 46571).
Two sequence sets, MN25 from this study and the
S. Tennessee strain CDC07-0191, were aligned using the NUCmer version 3.07 alignment program [
22], which revealed that the two strains were nearly identical at the genomic level, having <0.005% (1/20,000) SNPs. The shotgun reads from TN32 and MN47 were then aligned to the MN25 sequence using the Roche 454/GS Reference Mapper program, version 2.0.01.14 (Madison, WI, USA). Putative SNPs were generated by comparison of the consensus contigs to the reference genome (MN25).
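The idea of calling putative SNPs by comparing a consensus sequence with the reference can be sketched as follows. This is a simplified illustration and not the procedure implemented in NUCmer or the GS Reference Mapper: it assumes the two sequences are already aligned base for base, and the function names are hypothetical. Gaps and ambiguous calls are skipped so that only true base substitutions are reported.

```python
def putative_snps(reference: str, consensus: str) -> list:
    """Return (1-based position, reference base, consensus base) for each mismatch.

    Positions where either sequence has a gap ('-') or an ambiguous call ('N')
    are skipped, because they are not single-base substitutions.
    """
    if len(reference) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    snps = []
    pairs = zip(reference.upper(), consensus.upper())
    for position, (ref_base, alt_base) in enumerate(pairs, start=1):
        if ref_base in skip or alt_base in skip:
            continue
        if ref_base != alt_base:
            snps.append((position, ref_base, alt_base))
    return snps


def snp_percentage(n_snps: int, aligned_length: int) -> float:
    """SNP density expressed as a percentage of aligned positions."""
    return 100.0 * n_snps / aligned_length


# Toy aligned sequences (made up for illustration).
calls = putative_snps("ACGTNACGT", "ACCTAAC-T")
# One SNP per 20,000 aligned bases corresponds to 0.005%.
density = snp_percentage(1, 20000)
```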
Application of SNP typing methods to the clinical isolates
The newly detected SNP markers were applied to the human, animal, food, and environmental isolates for evolutionary and molecular epidemiological analyses. The nucleotide diversity (pi, π) was calculated using Nei’s diversity index to measure the degree of polymorphism of each marker within the S. Tennessee isolates. A phylogenetic dendrogram for SNP subtypes was computed by using the unweighted pair group method with arithmetic mean (UPGMA) analysis for categorical value, and a minimum spanning tree (MST) was constructed using BioNumerics version 6.6 (Applied Maths NV, Belgium). To identify the minimal set of SNP markers required to determine clonal S. Tennessee strains, SNP markers having higher nucleotide diversity (π > 0.09) were first selected, and then representative markers (minimum SNP set) were randomly selected from among a set of markers with the same profile. The MST was constructed for the 176 isolates with this minimum SNP set.
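The per-marker diversity calculation can be illustrated with a short sketch. It assumes the common form of Nei's gene diversity, h = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of allele i among n isolates; whether the small-sample correction n/(n − 1) was applied in this study is not stated, so it is exposed as an option. The function name and the toy allele calls are hypothetical.

```python
from collections import Counter


def nei_diversity(alleles: list, unbiased: bool = True) -> float:
    """Nei's gene diversity for one marker across a set of isolates.

    alleles: one allele call per isolate, e.g. ["A", "A", "G", ...].
    unbiased: apply the n / (n - 1) small-sample correction (assumption).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    frequencies = [count / n for count in Counter(alleles).values()]
    h = 1.0 - sum(p * p for p in frequencies)
    return h * n / (n - 1) if unbiased else h


# Toy marker: nine isolates carry "A" and one carries "G" (made-up data).
toy_marker = ["A"] * 9 + ["G"]
pi_unbiased = nei_diversity(toy_marker)                # 0.18 * 10 / 9
pi_biased = nei_diversity(toy_marker, unbiased=False)  # 1 - (0.81 + 0.01)
passes_threshold = pi_unbiased > 0.09                  # threshold from the text
```

A monomorphic marker gives a diversity of zero under either form, so such markers never pass the π > 0.09 filter.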
Discussion
Prior to 2006,
S. Tennessee was not a common
Salmonella serovar, resulting in a relatively small number of
S. Tennessee infections worldwide. Only one outbreak of
S. Tennessee infection, associated with contaminated powdered milk products and infant formula, had been reported to the United States (US) Centers for Disease Control [
23]; most other cases of
S. Tennessee infection were sporadic with unknown sources. However, after the multistate peanut butter outbreak of
S. Tennessee in the US, several
S. Tennessee-related outbreaks have occurred in humans, animals, and environments, revealing the persistent contamination of
S. Tennessee strains across various sources [
24‐
26]. In addition, a recent report on the association of
S. Tennessee infection between babies and reptiles highlights the importance of
S. Tennessee as a zoonotic pathogen [
26]. To cope with the increase in
S. Tennessee infection cases, an SNP-typing method was developed to evaluate the epidemiology of the peanut butter outbreak, and ultimately, to identify the mutational events of
S. Tennessee strains.
The comparison of three representative
S. Tennessee strains identified numerous SNPs, most of which were synonymous SNPs (sSNPs). While synonymous mutations are generally considered neutral, with minimal effect on the organism, non-synonymous SNPs (nsSNPs) sometimes lead to functional changes that may confer a selective advantage on the pathogen in spreading infection [
27,
28]. Some nsSNPs were found to be associated with bacterial colonization or host specificity [
28,
29]. In this study, one SNP marker (marker number 9) was found to be an nsSNP that replaced a glutamine codon with a stop codon. This marker is located within
ompC, which encodes a major outer membrane protein. In a previous study, it was found that
ompC was genetically stable in all tested
Salmonella serotypes except
S. Arizonae [
30]. However, this SNP was observed in two
S. Tennessee strains in the current study. While some studies have reported the detection of a higher proportion of sSNPs than nsSNPs [
31], consistent with our study, the opposite phenomenon appears to be more common in highly clonal organisms [
19,
32,
33]. Although the significance of this phenomenon has not yet been established [
32,
34,
35], sSNPs remain useful markers for investigating the genetic characteristics required to trace evolutionary origin [
12,
20].
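As a small illustration of how a single SNP can convert a glutamine codon into a stop codon, the sketch below enumerates the one-base changes that do so under the standard genetic code (glutamine: CAA, CAG; stop: TAA, TAG, TGA). The specific base change at marker 9 is not stated here, so this shows only which changes are possible in principle; the function name is hypothetical.

```python
GLN_CODONS = ("CAA", "CAG")          # glutamine codons, standard genetic code
STOP_CODONS = ("TAA", "TAG", "TGA")  # stop codons, standard genetic code


def single_base_routes(from_codons, to_codons):
    """List (source codon, target codon, 1-based codon position, change)
    for every pair of codons that differ at exactly one base."""
    routes = []
    for source in from_codons:
        for target in to_codons:
            differing = [i for i in range(3) if source[i] != target[i]]
            if len(differing) == 1:
                i = differing[0]
                routes.append((source, target, i + 1, f"{source[i]}>{target[i]}"))
    return routes


routes = single_base_routes(GLN_CODONS, STOP_CODONS)
```

Both possible routes are C-to-T changes at the first codon position (CAA to TAA and CAG to TAG).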
Application of the 84 SNP markers (selected from three strains) for the comparison of the 176
S. Tennessee isolates revealed relatively low genetic diversity, with a mean nucleotide diversity of 0.049 ± 0.018, indicating that two randomly selected isolates would be expected to differ at a given marker with a probability of only about 4.9% (Fig.
1). Generally, the nucleotide diversity of SNP markers is low, owing to the bi-allelic nature of SNP sites [
36]. However, the nucleotide diversity in the current study was lower than expected, which might be due to sampling bias. A balanced sample collection is important for evaluating the discriminatory power of a subtyping method [
21]. In the present study, the sample set was not sufficient for the evaluation of genetic diversity, because most of the human, food, and environmental samples were collected during or just after the peanut butter outbreak; as a result, the SNP analysis might not represent the entire spectrum of
S. Tennessee strains. In addition, the high clonality of
Salmonella spp. might contribute to lower genetic diversity. Minor genetic changes have been reported for
S. Typhimurium DT41 by MLVA [
37] and
S. Tennessee by PFGE and MLST [
38], indicating the overall genetic stability of
Salmonella species.
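The effect of the bi-allelic nature of SNP sites on diversity can be made explicit with a short derivation. It assumes that the per-marker diversity takes the form h = 1 − Σ p_i² without a small-sample correction, which may differ slightly from the estimator used in this study; p denotes the frequency of one of the two alleles.

```latex
% Assumption: per-marker diversity is h = 1 - \sum_i p_i^2 (no small-sample correction).
h = 1 - p^{2} - (1 - p)^{2} = 2p(1 - p) \le \tfrac{1}{2},
\qquad \text{with equality only at } p = \tfrac{1}{2}.
```

Under the same assumption, a diversity of 0.049 would correspond to a minor-allele frequency of roughly 0.025, since 2 × 0.025 × 0.975 ≈ 0.049.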
In our MST analysis, all outbreak-associated strains were included in clade 1, but so were some non-outbreak-associated strains. In contrast to subtypes 2, 3, and 5, which consisted of outbreak-associated or outbreak-suspected strains, subtypes 1 and 4 consisted of both outbreak-associated and non-outbreak-associated strains. Considering that two strains in subtype 4 were isolated from humans (Nov 2007 and Dec 2007) shortly after the peanut butter outbreak period (Aug 2006 to Jul 2007), late infection by
S. Tennessee outbreak-related strains might be possible. Non-outbreak-associated strains in subtype 1 mainly consisted of animal isolates. Although several
S. Tennessee strains were isolated from animals during the peanut butter outbreak, the animal isolates used in this study did not include outbreak-associated strains. Notably, the CDC records showed that
S. Tennessee isolates from chicken, porcine, and turkey sources were non-clinical, whereas bovine, turkey, other animals, and environmental sources were clinical, suggesting the possibility of chicken as an asymptomatic carrier of
S. Tennessee strains [
39]. In addition, two non-outbreak-associated strains in subtypes 1 and 4 were also isolated from poultry, implying a close relationship between the human and poultry isolates.
The results of the two subtyping methods, PFGE and SNP typing, were compared. While all the strains exhibiting the outbreak-related PFGE profile JNXX01.0010 belonged to subtype 1, the strains showing the PFGE profiles JNXX01.0011 and JNXX01.0026 belonged to a total of four and three subtypes, respectively, indicating the high discriminatory power of the SNP-typing method. On the other hand, subtypes 1, 3, and 4 consisted of strains with more than two kinds of PFGE profiles, indicating that neither method alone was sufficient to discriminate highly clonal
S. Tennessee strains. Considering that single-nucleotide diversity at restriction enzyme sites results in three-band differences, one- or two-band differences among outbreak-related PFGE profiles suggest that the
S. Tennessee strains are genetically stable [
40].
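A comparison of this kind amounts to cross-tabulating the two typing results. The sketch below is a minimal illustration using made-up (PFGE profile, SNP subtype) pairs, not data from this study; the function name and the profile labels are hypothetical.

```python
from collections import defaultdict


def cross_tabulate(isolates):
    """Return (SNP subtypes seen per PFGE profile, PFGE profiles seen per SNP subtype)."""
    by_profile = defaultdict(set)
    by_subtype = defaultdict(set)
    for pfge_profile, snp_subtype in isolates:
        by_profile[pfge_profile].add(snp_subtype)
        by_subtype[snp_subtype].add(pfge_profile)
    return dict(by_profile), dict(by_subtype)


# Made-up (PFGE profile, SNP subtype) pairs; not data from this study.
toy_isolates = [
    ("profile_X", 1), ("profile_X", 1),
    ("profile_Y", 1), ("profile_Y", 2), ("profile_Y", 3),
    ("profile_Z", 3),
]
by_profile, by_subtype = cross_tabulate(toy_isolates)

# PFGE profiles that SNP typing splits further, and SNP subtypes that PFGE splits.
split_profiles = sorted(p for p, subtypes in by_profile.items() if len(subtypes) > 1)
mixed_subtypes = sorted(s for s, profiles in by_subtype.items() if len(profiles) > 1)
```

A profile appearing in `split_profiles` is one that SNP typing resolves more finely than PFGE, while a subtype in `mixed_subtypes` is one that PFGE resolves more finely than SNP typing.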
Identification of minimal SNP marker sets can be beneficial for the rapid and economical determination of strain types. In the current study, a minimum set of 18 SNP markers was determined; these markers classified the 176 isolates into seven subtypes. While the 84 SNP markers generated nine subtypes, one marker that contributed to the generation of a subtype was found to be a singleton, and was excluded from the minimum set. Nevertheless, this minimum set of SNPs could likely be utilized to genotype S. Tennessee strains more rapidly and cost-effectively, and with similar discriminatory power as that of the complete 84 SNP panel.
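The selection of a minimum marker set can be sketched in two steps: keep markers whose diversity exceeds a threshold, then keep one randomly chosen representative from each group of markers that split the isolates in the same way. The example below is an illustration under stated assumptions rather than the procedure used in BioNumerics: the diversity estimator includes the n/(n − 1) correction, "same profile" is interpreted as an identical partition of the isolates, the random seed is fixed only for reproducibility, and all names and allele calls are hypothetical.

```python
import random
from collections import Counter, defaultdict


def nei_diversity(alleles):
    """Nei's gene diversity with the n / (n - 1) correction (assumption)."""
    n = len(alleles)
    frequencies = [count / n for count in Counter(alleles).values()]
    return (1.0 - sum(p * p for p in frequencies)) * n / (n - 1)


def partition_pattern(alleles):
    """Relabel alleles by order of first appearance, so that markers splitting
    the isolates identically share a pattern whatever the actual bases are."""
    labels = {}
    return tuple(labels.setdefault(allele, len(labels)) for allele in alleles)


def minimum_snp_set(markers, threshold=0.09, seed=0):
    """Pick one random representative per distinct partition pattern among
    markers whose diversity exceeds the threshold."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for name, alleles in markers.items():
        if nei_diversity(alleles) > threshold:
            groups[partition_pattern(alleles)].append(name)
    return sorted(rng.choice(sorted(names)) for names in groups.values())


# Made-up allele calls for ten isolates at four markers; not data from this study.
toy_markers = {
    "m1": list("AAAAAAAAAG"),
    "m2": list("CCCCCCCCCT"),  # splits the isolates exactly as m1 does
    "m3": list("AAAAAGGGGG"),
    "m4": list("AAAAAAAAAA"),  # monomorphic, diversity 0
}
selected = minimum_snp_set(toy_markers)
```

In this toy case, two markers are retained: `m3` and one of the redundant pair `m1`/`m2`, while the monomorphic `m4` is dropped.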
Investigation of outbreaks of foodborne bacterial diseases using sequencing-based molecular typing is relatively new, and this approach will aid the investigation of the epidemiology and microevolution of pathogenic bacteria by discriminating between outbreak-related and sporadic clinical cases. In addition, this approach enables us to understand the population structure of the bacterial subtypes involved in an outbreak. While our method does not have direct applications in the clinical setting, we believe that this study will help identify the evolutionary origin of an outbreak.
Authors’ contributions
SC and AMS conceived and designed the study. DB, SR, FD, JL, JG, and ME provided the S. Tennessee isolates and epidemiological information. SC carried out the experiments, and HJD and SC interpreted the data. HJD was a major contributor in writing the manuscript. All authors read and approved the final manuscript.