The quality of the 101-base paired-end reads was confirmed using FastQC (
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc). High-quality reads were assembled with SPAdes (v3.6.2) [
5], using the options ‘--careful’ to reduce the number of mis-assemblies and ‘--cov-cutoff auto’ to remove potentially mis-assembled low-coverage contigs. Annotation was performed using the RAST server [
6]. The annotated assemblies are available in GenBank. Nucleotide variation was identified relative to
V. cholerae O1 El Tor strain TEM/25/01-004 (whole genome sequence tag TANZ_56) in order to avoid spurious single-nucleotide variants (SNVs). PARSNP (v1.2) [
7] was used to extract and align the variable nucleotides from the core genome, using the parameter ‘-c’ to force the inclusion of all input genomes, and to generate the ‘.vcf’ variant description file and the ‘.ggr’ alignment description file. The ‘.ggr’ file was loaded into Gingr (v1.2) [
7] to visualize the alignments and export the variant nucleotide alignment as a ‘.mfa’ file. The ‘.vcf’ file was then used, with an in-house script, to remove from the ‘.mfa’ file all variants located less than 1 kb from a contig edge. FastTree2 (v2.1.9) [
8] was used with default parameters to generate a maximum-likelihood tree in Newick format from the corrected ‘.mfa’ alignment file. iTOL (
http://itol.embl.de/) [
9] was used to visualize the maximum-likelihood tree. To place the isolates in the seventh pandemic phylogeny, we mapped the reads to the
Vibrio cholerae O1 El Tor reference N16961 using SMALT (
http://www.sanger.ac.uk/resources/software/smalt) [
10]. The alignment was then stripped of putative recombinant sites using Gubbins [
11]. The resulting alignment of 5020 nucleotides was used to infer a phylogenetic tree with RAxML [
12] under the GTR model with 100 bootstrap replicates. The pre-seventh pandemic strain M66 (NCBI accession numbers CP001233 and CP001234) was used as the outgroup. The whole genome sequences from previous publications are listed in Additional file
1: Table S1.
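
The in-house script used to discard variants near contig edges is not reproduced here. As an illustration only, the following Python sketch shows one way such a filter could work. It assumes that each variant has already been reduced to a (contig name, 1-based position) pair, that contig lengths are known, and that column i of the variant alignment corresponds to the i-th variant record. The function names and data layout are hypothetical and are not taken from the original script.

```python
EDGE_MARGIN = 1000  # 1 kb margin, as stated in the text


def near_contig_edge(pos, contig_length, margin=EDGE_MARGIN):
    """Return True if a 1-based position lies less than `margin` bases
    from either end of its contig."""
    distance_from_start = pos - 1
    distance_from_end = contig_length - pos
    return distance_from_start < margin or distance_from_end < margin


def keep_mask(variants, contig_lengths, margin=EDGE_MARGIN):
    """Build one boolean per variant: True means the variant is retained.

    `variants` is a list of (contig_name, position) pairs (hypothetical
    layout); `contig_lengths` maps contig names to their lengths.
    """
    return [
        not near_contig_edge(pos, contig_lengths[contig], margin)
        for contig, pos in variants
    ]


def drop_alignment_columns(sequences, mask):
    """Remove the alignment columns whose mask entry is False.

    `sequences` maps sequence names to aligned strings of equal length.
    Assumes that column i corresponds to variant i.
    """
    return {
        name: "".join(base for base, keep in zip(seq, mask) if keep)
        for name, seq in sequences.items()
    }
```

Applying `keep_mask` to the variant list and passing the result to `drop_alignment_columns` would yield a corrected alignment of the kind given to FastTree2; parsing of the ‘.vcf’ and ‘.mfa’ files is deliberately left out of the sketch.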