Clostridium difficile is an obligate anaerobic Gram-positive bacillus carried asymptomatically in the gut of approximately 2 to 7% of healthy human adults (28, 35). Nosocomial acquisition of C. difficile in humans is common, and symptoms ranging from mild diarrhea to severe pseudomembranous colitis can develop during antibiotic treatment or shortly afterwards (1, 28, 40). Symptoms are caused by toxins A and B, encoded by the tcdA and tcdB genes located within the pathogenicity locus (PaLoc), and potentially by an additional binary toxin (reviewed in reference 7).
The rate and severity of nosocomial infections increased between the years 2000 and 2008 (27, 46, 47), coincident with the emergence of a hypervirulent fluoroquinolone-resistant clone, designated PCR ribotype 027 (31, 34). Outbreaks of C. difficile infection (CDI) caused by the 027 clone have been reported in North America and throughout Europe (15, 18, 23, 31, 32). This strain is now endemic, causing 36% of cases in England, United Kingdom, from April 2008 to March 2009 (10). Mortality can range from 6 to 15% (2), and the economic burden of CDI is substantial, with an estimated direct cost of over $6,000 per case in the United States (37).
Studies of global epidemiology require easily comparable genotyping data for large numbers of bacterial isolates. Genotyping methods in common use for C. difficile include PCR ribotyping, pulsed-field gel electrophoresis (PFGE), and restriction-endonuclease analysis (REA) (4, 10, 26, 41). These techniques are generally labor- and resource-intensive, not easily adapted to very-high-throughput use, and often restricted to reference laboratories. Furthermore, interlaboratory comparison of data can be difficult when they are based on gel banding patterns. Multilocus sequence typing (MLST) is a microbial genotyping method that discriminates isolates using the nucleotide sequences of housekeeping gene fragments (24). Each unique sequence at a locus is assigned an allele number, and each unique combination of alleles is assigned a sequence type (ST) number. The MLST technique is scalable, according to the question to be addressed or the resources available, and amenable to automation using very-high-throughput robotic systems (12). Searchable Internet-accessible MLST databases (as at http://pubmlst.org/) allow laboratories performing MLST to maintain ownership of their data, while having a single laboratory as curator of the database avoids the confusion that can arise when new allele and ST numbers are assigned (14). MLST data are also a powerful tool for studying the population biology of bacterial species (25, 42). An MLST scheme for C. difficile has been described (20) but has not been widely adopted. This may be partly because the ddl locus behaved as a null allele, failing to amplify in certain strains (L. Lemee, personal communication). Furthermore, a curated Internet-accessible database was not available.
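The way an allelic profile maps to a sequence type can be illustrated with a minimal Python sketch. The locus names, allele numbers, and resulting ST numbers below are hypothetical placeholders and are not values from any published scheme or database; in a real setting, new allele and ST numbers would be issued by the database curator rather than generated locally.

```python
from typing import Dict, Tuple

# Hypothetical locus names, used only for illustration.
LOCI = ("locus1", "locus2", "locus3", "locus4", "locus5", "locus6", "locus7")

Profile = Tuple[int, ...]


class STRegistry:
    """Toy curator: one ST number per unique combination of allele numbers."""

    def __init__(self) -> None:
        self._st_by_profile: Dict[Profile, int] = {}

    def assign(self, alleles: Dict[str, int]) -> int:
        # Order alleles by locus so the same combination always gives the same key.
        profile = tuple(alleles[locus] for locus in LOCI)
        if profile not in self._st_by_profile:
            # A previously unseen combination receives the next free ST number.
            self._st_by_profile[profile] = len(self._st_by_profile) + 1
        return self._st_by_profile[profile]


registry = STRegistry()
isolate_a = dict(zip(LOCI, (1, 1, 1, 10, 1, 3, 5)))  # made-up allele numbers
isolate_b = dict(zip(LOCI, (1, 1, 2, 10, 1, 3, 5)))  # differs at one locus
isolate_c = dict(isolate_a)                           # identical to isolate_a

st_a = registry.assign(isolate_a)
st_b = registry.assign(isolate_b)
st_c = registry.assign(isolate_c)
```

In this sketch a single allele difference is enough to produce a new ST, whereas identical profiles always share one, which is why centrally curated numbering keeps results comparable among laboratories.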
MATERIALS AND METHODS
C. difficile stools and culture.
A total of 215 human stool specimens submitted to the Clinical Microbiology Laboratory, John Radcliffe Hospital, Oxford, United Kingdom, between 12 July and 17 October 2008 were included in this study. Stools were from both hospital and community patients. Stools were chosen so that half were sequential, enzyme-linked immunosorbent assay (ELISA) positive (n = 107) with sufficient stool remaining, and half were ELISA negative (n = 108) submitted during the same time period (Premier Toxins A&B Enzyme Immunoassay; Meridian Bioscience Europe, Villa Cortese, Italy). All stool samples underwent culture for C. difficile. Industrial methylated spirits ([IMS] 0.5 ml) was added to a 0.5-ml fecal sample (pea-sized portion if the stool was formed), and the sample was vortex mixed and incubated at room temperature for 1 h. A loopful was then cultured onto modified Brazier's cycloserine-cefoxitin-egg yolk (CCEY) agar (CCEY agar base containing cycloserine-cefoxitin supplement and 5% defibrinated horse blood), and the plates were incubated anaerobically at 37°C for up to 7 days. A single colony was subcultured onto a Columbia blood agar (CBA) plate and incubated for 48 h, after which colonies giving the characteristic odor and fluorescence under UV illumination were obtained. For long-term storage, isolates were emulsified in nutrient broth containing 10% glycerol and stored at −80°C.
An additional 50 isolates were obtained from a collection held at Leeds General Infirmary (reference laboratory for the C. difficile Ribotyping Network for England and Northern Ireland). They represented 45 different PCR ribotypes (plus five duplicates) chosen to represent the overall genetic diversity of C. difficile (determined by PCR ribotyping) and were used to validate the MLST scheme.
Extraction of total stool DNA.
Total DNA was extracted from stool samples using a FastPrep homogenizer (MP Biomedicals Europe, Illkirch, France) to lyse cells and spores, followed by DNA purification using a FastDNA Spin kit for soil (MP Biomedicals). The manufacturer's protocol was followed with the following refinements. Stool samples (100 μl or equivalent volume if the stool was formed) were added to 978 μl of sodium phosphate buffer in impact-resistant 2.0-ml tubes containing matrix E, which comprises 1.4-mm ceramic spheres, 0.1-mm silica spheres, and one 4-mm glass bead in buffer. MT buffer (MP Biomedicals) (122 μl) was added, and stools were homogenized for 40 s at a speed of 6.0 m per s. The lysate was clarified by centrifugation at 13,000 rpm for 15 min. Proteins were precipitated using 250 μl of protein precipitation solution (PPS) and removed by centrifugation for 10 min. The supernatant was mixed with 1 ml of silica binding matrix for 2 min to take up DNA, and then the matrix was allowed to settle for 5 min. The binding matrix was transferred to a SPIN filter (MP Biomedicals) and washed using 500 μl of SEWS-M (salt-ethanol wash solution). After the sample was air dried at room temperature, DNA was eluted from the matrix in 100 μl of DNA elution solution ([DES] DNase and pyrogen-free water).
Preparation of chromosomal DNA from cultured C. difficile isolates.
Isolates were cultured onto CBA and incubated anaerobically for 48 h. A few colonies were emulsified in TE (Tris-EDTA) buffer (Sigma-Aldrich Co., Ltd., Gillingham, United Kingdom) and heated at 100°C for 10 min. Debris was removed by centrifugation at 13,500 rpm for 2 min, and the supernatant was removed for use in MLST. DNA was stored at −20°C.
C. difficile nucleotide sequence alignment and choice of candidate loci for MLST.
Ten publicly available C. difficile genome sequences, including six of PCR ribotype 027, were aligned using Mauve alignment software (5). The annotated C. difficile 630 genome (36) was included as a reference. This alignment contained several large gaps and was refined using BLAST to position fragments of each genome that were left unaligned by Mauve. Candidate regions for MLST were determined from the refined alignment as follows. The numbers of variable sites and the numbers of gaps were calculated in windows of 500 bp (10 to 60 variable sites). MLST loci were chosen such that there was a significant degree of divergence across the 500 bp and no gaps were present. Fragments were annotated according to their orthologues in C. difficile 630 to ensure that they spanned housekeeping genes. Candidate fragments were tested in silico for suitability for primer design using the standard BLASTn search for “short nearly exact matches,” which is equivalent to BLASTn with the following parameters: word size, 7; low-complexity filter (DUST) off; expect value, 1,000. The database was the entire GenBank nonredundant database (nr). Details are found at http://www.ncbi.nlm.nih.gov/blast/producttable.shtml#shortn. PCR was carried out on a subset of C. difficile isolates to verify the amplification efficiency of the new MLST fragments. The seven loci and primers chosen for MLST are shown in Table 1.
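The window-based screen for candidate regions can be sketched in Python as follows. This is an illustration under stated assumptions and not the software used in the study: the function names are hypothetical, windows are assumed to be nonoverlapping, the 10-to-60 range is treated as an inclusion filter, and any column containing a gap is counted as a gap rather than as a variable site.

```python
from typing import List, Tuple

GAP = "-"


def window_stats(alignment: List[str], start: int, size: int) -> Tuple[int, int]:
    """Return (variable_sites, gap_columns) for one window of an alignment."""
    variable = gaps = 0
    for col in range(start, start + size):
        column = {seq[col] for seq in alignment}
        if GAP in column:
            gaps += 1          # assumption: gapped columns are not scored as variable
        elif len(column) > 1:
            variable += 1
    return variable, gaps


def candidate_windows(alignment: List[str], size: int = 500, step: int = 500,
                      min_var: int = 10, max_var: int = 60) -> List[int]:
    """List start positions of gap-free windows within the variable-site range."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for start in range(0, length - size + 1, step):
        variable, gaps = window_stats(alignment, start, size)
        if gaps == 0 and min_var <= variable <= max_var:
            hits.append(start)
    return hits


# Toy alignment with 4-bp windows so that the result can be checked by eye.
toy = [
    "ACGTACGTAC-TACGT",
    "ACGAACGTACGTACGA",
    "ACGTACCTACGTACGT",
]
toy_hits = candidate_windows(toy, size=4, step=4, min_var=1, max_var=2)
```

In the toy alignment the third window is rejected because it contains a gap, while the other three each hold one variable site. With real genome alignments the defaults (500-bp windows, 10 to 60 variable sites) would apply.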
C. difficile high-throughput multilocus sequence typing.
MLST was performed as described below by setting up PCR and sequencing reactions in 24-, 48-, or 96-well plates (Fig. 3 shows the procedure for 24 samples, but it can easily be scaled up to 48 samples). Seven PCR amplicons were obtained for each isolate using the primers shown in Table 1. Each 50-μl PCR mixture contained 39.75 μl of molecular biology-grade water (Sigma-Aldrich Co., Ltd.), 5 μl of 10× PCR buffer (Qiagen Ltd., Crawley, United Kingdom), 1 μl of a 10 μM concentration of each forward and reverse primer, 1 μl of 10 mM deoxynucleoside triphosphate (dNTP) mix (Invitrogen Corp., Paisley, United Kingdom), 0.25 μl of HotStart Taq DNA polymerase (Qiagen Ltd.), and 2 μl of C. difficile chromosomal DNA (approximately 10 ng) or extracted total stool DNA. The amplification conditions were 95°C for 15 min, followed by 35 cycles of 94°C for 30 s, 50°C for 40 s, and 72°C for 70 s, with a final extension at 72°C for 5 min and storage at 15°C. The amplification products were purified by precipitation with 20% polyethylene glycol (molecular weight, 8,000) and 2.5 M NaCl, and their nucleotide sequences were determined on each DNA strand using the amplification primers and BigDye Ready Reaction Mix (Applied Biosystems, Warrington, United Kingdom) as follows. Each 10-μl sequencing reaction mixture comprised 2 μl of PCR amplicon, 4 μl of a 1:15 dilution of either forward or reverse PCR primer (0.66 μM), 0.25 μl of BigDye Ready Reaction Mix, 1.875 μl of 5× sequencing buffer (20 ml of stock solution comprised 200 μl of 1 M MgCl2, 8 ml of 1 M Tris-HCl, pH 9, and 11.8 ml of molecular biology-grade water [all from Sigma-Aldrich Co., Ltd.]), and 1.875 μl of molecular biology-grade water. Dilution of the BigDye Ready Reaction Mix using 5× sequencing buffer reduces the cost of high-throughput sequencing without any compromise in sequence quality. The reaction conditions were 30 cycles of denaturation at 96°C for 10 s, annealing at 50°C for 5 s, and extension at 60°C for 2 min. Unincorporated dye terminators were removed by precipitation of the termination products with 2 volumes of ethanol and 0.1 volume of 3 M sodium acetate (pH 5.2), followed by centrifugation, and the resulting pellet was then washed with 70% ethanol. The reaction products were separated and detected using a 3730 XL DNA analyzer (Applied Biosystems). For each sample, the program STARS (Sequence Typing Analysis Retrieval System [http://pubmlst.org/software/assembly/]) was used to rapidly collate paired reads, determine sequences, and identify alleles. The data for C. difficile alleles and STs were deposited in a newly developed C. difficile MLST database, which is accessible at http://pubmlst.org/cdifficile. Phylogenetic analysis was performed using the program MEGA, version 4 (Molecular Evolutionary Genetics Analysis [http://www.megasoftware.net/]).
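When reactions are set up in plates, the per-reaction volumes given above can be scaled into a master mix. The following Python sketch uses the component volumes from the text; the 10% pipetting overage, the function name, and the choice to add template separately to each well are assumptions made for illustration and are not part of the published protocol.

```python
# Per-reaction volumes (microliters) as stated in the text for one 50-ul PCR.
PCR_MIX_UL = {
    "water": 39.75,
    "10x PCR buffer": 5.0,
    "forward primer (10 uM)": 1.0,
    "reverse primer (10 uM)": 1.0,
    "dNTP mix (10 mM)": 1.0,
    "HotStart Taq polymerase": 0.25,
    "template DNA": 2.0,
}

# Per-reaction volumes (microliters) for one 10-ul sequencing reaction.
SEQ_MIX_UL = {
    "PCR amplicon": 2.0,
    "primer (1:15 dilution)": 4.0,
    "BigDye Ready Reaction Mix": 0.25,
    "5x sequencing buffer": 1.875,
    "water": 1.875,
}


def master_mix(n_reactions, overage=0.10, per_reaction=PCR_MIX_UL,
               exclude=("template DNA",)):
    """Scale shared components for one locus; template is added per well.

    The overage fraction is a hypothetical allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in per_reaction.items() if name not in exclude}


total_per_reaction = sum(PCR_MIX_UL.values())      # should equal 50 ul
final_primer_um = 10.0 * PCR_MIX_UL["forward primer (10 uM)"] / total_per_reaction
diluted_seq_primer_um = 10.0 / 15                   # 1:15 dilution of a 10 uM stock
mix_24 = master_mix(24)                             # one locus, 24 samples
```

Summing the listed volumes confirms that the PCR components total 50 μl and the sequencing components total 10 μl, and the 1:15 primer dilution works out to roughly 0.67 μM, in line with the 0.66 μM quoted. Because each locus has its own primer pair, one such mix would be prepared per locus.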
Detection of PaLoc genes by PCR.
The oligonucleotide primers used to detect the tcdA (encoding toxin A), tcdB (encoding toxin B), and tcdC (encoding a negative regulator of toxins A and B) sequences found within the pathogenicity locus operon (PaLoc) are summarized in Table 1. The tcdA assay was published by Lemee et al. (21) and yields a 369-bp amplicon for toxin A-positive B-positive (A+ B+) strains and a 110-bp amplicon for A-negative (A−) B+ strains, which contain a deletion in the tcdA gene. The reaction conditions were 95°C for 15 min, followed by 35 cycles of 94°C for 30 s, 52°C for 30 s, and 72°C for 40 s, with a final extension at 72°C for 5 min and storage at 15°C. The tcdB primers yield a 688-bp amplicon under the same reaction conditions used for tcdA, except that annealing was at 50°C for 40 s and extension was at 72°C for 70 s. The tcdC primers (33) amplify the 5′ region of the tcdC gene, giving a 475-bp amplicon under the same conditions used for tcdA. The absence of the PaLoc was demonstrated using primers lok1 and lok3 (3), which yield a 769-bp amplicon in strains without the PaLoc. The reaction conditions were the same as those for tcdB above.
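The expected amplicon sizes lend themselves to a simple lookup when gel results are recorded. The Python sketch below is only an illustration: the function names, the 20-bp size tolerance, and the "indeterminate" category are assumptions, and the tcdC result is left out of the classification for brevity.

```python
from typing import Optional

# Expected amplicon sizes (bp) taken from the text.
TCDA_INTACT_BP = 369    # toxin A+ B+ strains
TCDA_DELETED_BP = 110   # A- B+ strains carrying the tcdA deletion
TCDB_BP = 688
LOK_BP = 769            # lok1/lok3 product, obtained only when the PaLoc is absent


def band_matches(observed: Optional[int], expected: int, tolerance: int = 20) -> bool:
    """True if a band was seen and its estimated size is near the expected size.

    The tolerance is a hypothetical allowance for gel sizing error.
    """
    return observed is not None and abs(observed - expected) <= tolerance


def interpret_paloc(tcdA: Optional[int], tcdB: Optional[int], lok: Optional[int]) -> str:
    """Map observed band sizes (bp, or None for no band) to a toxin genotype."""
    has_b = band_matches(tcdB, TCDB_BP)
    if band_matches(lok, LOK_BP) and tcdA is None and tcdB is None:
        return "PaLoc absent"
    if has_b and band_matches(tcdA, TCDA_INTACT_BP):
        return "A+ B+"
    if has_b and band_matches(tcdA, TCDA_DELETED_BP):
        return "A- B+"
    return "indeterminate"   # unexpected combination; repeat the assay


calls = [
    interpret_paloc(tcdA=370, tcdB=690, lok=None),
    interpret_paloc(tcdA=110, tcdB=688, lok=None),
    interpret_paloc(tcdA=None, tcdB=None, lok=769),
    interpret_paloc(tcdA=369, tcdB=None, lok=None),
]
```

Any combination of bands not described in the text falls through to "indeterminate" rather than being forced into a genotype.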
PCR ribotyping.
All PCR ribotyping of reference isolates and cultured isolates described in the present study was performed at the reference laboratory for the C. difficile Ribotyping Network for England and Northern Ireland, Leeds General Infirmary. PCR ribotyping was performed as described previously, with modifications (30). Briefly, bacterial growth was harvested from cultures raised on modified Brazier's CCEY agar with the omission of egg yolk and addition of 5 mg/liter lysozyme (CCEYL) (BioConnections, Wetherby, United Kingdom) for 48 h at 37°C (44). Template DNA was prepared using a QIAxtractor automated nucleic acid extraction system (Qiagen Ltd.). Amplification reactions were performed in 50-μl volumes containing 50 pmol of both forward and reverse primers, 25 μl of HotStart Taq Plus PCR Master Mix (Qiagen Ltd.), 19 μl of water, and 5 μl of DNA template. The reaction mixtures were activated by heating to 95°C for 5 min and then subjected to 30 cycles of 92°C for 1 min, 55°C for 1 min, and 72°C for 1.5 min. A final cycle of 95°C for 1 min, 55°C for 45 s, and 72°C for 5 min was added. The resultant amplimer was concentrated to a final volume of approximately 20 μl by heating the opened reaction tubes at 75°C for 30 min. Amplification products were subjected to electrophoresis using 3% Metasieve agarose (Flowgen Bioscience, Nottingham, United Kingdom) at a field strength of 7.5 V/cm for approximately 2.5 h. Agarose gels were imaged using a GeneGenius camera system (Syngene, Cambridge, United Kingdom) after ethidium bromide staining. DNA profiles were analyzed and identified against a library of known PCR ribotypes using BioNumerics, version 4.6, software (Applied Maths, Belgium).
ID.
The index of discrimination (ID) for MLST and PCR ribotyping was calculated according to Hunter and Gaston (11). The ID expresses the probability that two unrelated isolates sampled at random from the collection will be assigned to different types by the typing method.
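Hunter and Gaston's index is commonly written as D = 1 − [1/(N(N − 1))] Σ n_j(n_j − 1), where N is the number of isolates and n_j is the number of isolates assigned to the jth type. A minimal Python sketch of this calculation follows; the function name is hypothetical and the example type assignments are toy data, not results from this study.

```python
from collections import Counter
from typing import Hashable, Iterable


def index_of_discrimination(types: Iterable[Hashable]) -> float:
    """Hunter-Gaston index: probability that two isolates drawn at random
    without replacement are assigned to different types."""
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Toy collection: three isolates of one type and one each of two others.
toy_types = ["ST-1", "ST-1", "ST-1", "ST-11", "ST-37"]
toy_id = index_of_discrimination(toy_types)
```

The same function can be applied to ST assignments and to PCR ribotype assignments for one collection, which allows the two methods to be compared on equal terms.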
DISCUSSION
MLST is a proven technology for understanding the molecular epidemiology and population biology of bacterial species (25). Although it has been applied to a diverse collection of C. difficile isolates (20), MLST has not been widely adopted for this organism, in contrast to the majority of clinically important bacterial species (25). Our aim was to further develop MLST for C. difficile, setting up a more robust method by the following steps: (i) replacing the locus that gave a null allele in the previously published scheme (20) with one present in all strains, (ii) improving discrimination by using longer sequences for MLST, and (iii) establishing an Internet-accessible MLST database to allow straightforward accumulation of data over time and to simplify the comparison of data among laboratories. This MLST scheme for C. difficile was also sufficiently robust to allow typing to be performed directly on DNA extracted from stool, without culture. This could potentially be used to generate actionable genotyping data close to real time, since the entire process can be completed for a batch of 24 isolates in 3.5 days (Fig. 3), at a consumables cost of £15 per stool (or $24.65 as of 29 October 2009) plus the cost of one graduate-level member of staff.
The MLST scheme was sufficiently discriminatory to give typing data which can be interpreted with confidence; according to Hunter and Gaston (11), an ID greater than 0.90 is desirable to meet this requirement. For our 102 clinical isolates, MLST and PCR ribotyping had comparable discriminatory abilities (ID of 0.90 for MLST and of 0.92 for PCR ribotyping). The differences between the methods were generally consistent with a simple genetic explanation; multiple ribotypes for the same ST usually had very similar profiles, and multiple STs for the same ribotype were generally very closely related. Capillary gel electrophoresis-based PCR ribotyping is a promising tool to study subtypes within ribotypes, and it may assist the explanation of such observations (13). The differences may also be consistent with the limited recombination that may be characteristic of C. difficile.
We calculated the ID as 0.958 for the previously published MLST scheme (34 STs, 62 PCR ribotypes, and 72 isolates) (20) and as 0.983 for PCR ribotyping for the same collection. However, this is not an entirely robust comparison since all the isolates were specifically chosen for their genetic diversity, and a true ID should reflect the capacity of a typing method to discriminate epidemiologically unrelated isolates within a population.
Pulsed-field gel electrophoresis is another genotyping technique widely used to characterize C. difficile. The IDs for PFGE, the previously published MLST scheme of Lemee et al. (20), and PCR ribotyping were found by Killgore et al. (17) to be 0.843 (PFGE), 0.699 (MLST), and 0.700 (PCR ribotyping) for a collection of 42 isolates from four countries representing epidemic strains and the next most commonly isolated strain types.
Despite the relatively low overall genetic diversity detected within these housekeeping loci, it was still possible to identify four different phylogenetic groups of C. difficile STs (Fig. 2). The majority of STs clustered in group 1, group 2 contained ST-1 (PCR ribotype 027), group 3 contained two STs associated with PCR ribotype 023, and group 4 contained toxin A− B+ ST-37 (PCR ribotype 017). A single outlier, ST-11, was associated with PCR ribotype 078. The previously described MLST scheme for C. difficile identified three divergent lineages, one containing the A− B+ isolates, which corresponds to our group 4 (20). Stabler et al. (39) used comparative genomics to identify four clades, and these appear to correlate with the four groups we have identified by MLST in the present study (Fig. 2). In that previous study, HA1 (human and animal 1) (39) contained mainly human isolates with just a few animal isolates, and this clade probably corresponds to our group 1, which contained the majority of our human isolate STs. HA2 probably corresponds to our group 3, as this contained mainly animal isolates (pig and bovine), with few isolates from humans. These data suggest that genotypes clustered by MLST may correlate with groups derived from whole-genome comparisons using DNA microarrays (39), implying that MLST may be an accurate proxy for whole-genome analysis. The newly emergent ST-11 (PCR ribotype 078) hypervirulent clone was a genetically distinct outlier. This genotype causes infection in humans, pigs, and calves (6, 16) and has been found in cooked and raw meat products (38). Multilocus variable-number tandem-repeat analysis (MLVA) data confirmed a strong degree of genetic relatedness between human and animal isolates belonging to this genotype in The Netherlands (9). ST data presented here suggest that ST-11 (078) has emerged from a single, genetically distinct clade. The other four ST groups may represent different C. difficile clonal complexes, with the level of nucleotide sequence divergence between STs representing each group ranging from 11/3,501 (0.3%) to 60/3,501 (1.7%).
A robust MLST scheme can now be applied to studies of C. difficile epidemiology and population structure. Direct MLST of C. difficile in stool provides a rapid genotyping method which generates data that are easily compared among laboratories using an Internet-accessible database. It will now be possible to test in a clinical setting the utility of MLST for outbreak identification, detection of transmission events among patients, and the identification of emergent hypervirulent clones, thereby assessing the potential benefits of MLST to individual patients and hospital infection control.