Background
In recent years, countries across the Americas have experienced the emergence and endemic circulation of various mosquito-borne viruses, making this a critical area for public health surveillance and epidemiologic research. Zika virus (ZIKV) caused a particularly widespread epidemic, with over 800,000 suspected or confirmed cases reported [1]. Given estimated seroprevalence rates between 36% and 76% [2–5], the true number of ZIKV infections in the Americas is likely much higher. With neither a vaccine nor ZIKV-specific treatments available, understanding the epidemiology of ZIKV is our primary tool for controlling disease spread [6]. However, because many infections are asymptomatic [2], the analysis of surveillance data alone yields inaccurate estimates of when ZIKV arrived in a country [7, 8]. In such cases, introduction timing and transmission dynamics post-introduction are better inferred from genomic epidemiological studies, which use joint analysis of viral genome sequences and epidemiologic case data. Indeed, such studies have defined our understanding of when ZIKV arrived in Brazil [7], described general patterns of spread from Brazil into other countries in the Americas [7, 9, 10], and been used to investigate the extent of endemic transmission occurring post-introduction [8]. Genomic epidemiological studies of the spread of ZIKV in the Americas have aided our understanding of the epidemic [7–19], but generally, ZIKV pathogen sequencing has remained a challenge for the public health community [20].
Colombia has a population of approximately 48 million people and is home to Aedes aegypti and Ae. albopictus mosquitoes, which are commonly found at elevations below 2000 m above sea level [21]. Public health surveillance for arboviral diseases, along with other notifiable conditions, is performed by the Instituto Nacional de Salud de Colombia (INS) [21]. While suspected cases from other municipalities were reported earlier [22], the INS first confirmed ZIKV circulation in mid-September 2015, in the Turbaco municipality on the Caribbean coast. ZIKV spread throughout the country, appearing in areas infested with Ae. aegypti that experience endemic dengue transmission and previous circulation of chikungunya virus [23]. Over the entire epidemic, Colombia reported 109,265 cases of Zika virus disease [24], making it the second most ZIKV-affected country in the Americas after Brazil. The extent of the epidemic led the INS to start active surveillance for congenital Zika syndrome [25] as well as other neurological syndromes associated with ZIKV infection [26]. While the INS determined that epidemic ZIKV transmission ended in July 2016, they continue to perform surveillance for endemic transmission.
Despite numerous reported cases, only 12 whole ZIKV genomes from Colombian clinical samples were publicly available before this study. These sequences included 1 from Barranquilla, Atlántico department, 4 from Santander department, and 7 for which departmental or municipal information was unspecified. We sequenced an additional 8 samples from ZIKV-positive human clinical and Ae. aegypti specimens, sampled from previously unrepresented Colombian departments. We describe here the first detailed phylogeographic analysis of Colombian ZIKV to estimate when, and how frequently, ZIKV was introduced into Colombia.
Discussion
Despite the scale of the Colombian epidemic, publicly available sequence data were limited, and no detailed genomic epidemiological analysis of ZIKV dynamics had been performed. We sought to improve genomic sampling for Colombia, and to perform a detailed genomic analysis of the Colombian epidemic. Only 12 Colombian genomes were available prior to this study. To these data we added 8 new sequences sampled broadly across Colombia, and performed a phylogeographic analysis of American ZIKV. We describe general transmission patterns across the Americas and present estimates of ZIKV introduction timing and frequency specific to Colombia. We find evidence of at least two introductions of ZIKV to Colombia, yet remarkably the majority of Colombian viruses cluster within a single clade, indicating that a single introduction event caused the majority of ZIKV cases in Colombia. Under the assumption that viruses sequenced from Colombia are random samples of Colombian ZIKV cases, we find that Colombian ZIKV diversity is well represented by 20 Colombian genomes. It is therefore unlikely that further genomic sampling would reveal more introductions of ZIKV into Colombia. ZIKV dispersal out of Colombia appears widespread, with movement to bordering countries (Panama, Venezuela, Ecuador, and Peru) as well as more distal countries in the Caribbean.
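The sampling argument above (that 20 genomes represent Colombian ZIKV diversity well) can be illustrated with a rarefaction-style subsampling sketch. This is not the analysis performed in this study; it is a minimal illustration under stated assumptions. The function name `rarefaction_curve` and the 18/2 split of genomes between two introduction clades are hypothetical and chosen only for demonstration. The idea is to draw random subsamples of increasing size and count how many distinct introductions each subsample contains; a curve that flattens before reaching the full sample size suggests that additional sampling is unlikely to reveal new introductions, provided the sequenced viruses are a random sample of cases.

```python
import random


def rarefaction_curve(labels, n_reps=1000, seed=0):
    """Mean number of distinct introduction clades seen in random subsamples.

    labels: one clade label per sequenced genome.
    Returns a list whose entry k-1 is the mean count of distinct labels
    over n_reps subsamples of size k, drawn without replacement.
    """
    rng = random.Random(seed)
    curve = []
    for k in range(1, len(labels) + 1):
        total = 0
        for _ in range(n_reps):
            total += len(set(rng.sample(labels, k)))
        curve.append(total / n_reps)
    return curve


# Hypothetical assignment of 20 genomes to two introductions (assumed split).
labels = ["clade1"] * 18 + ["clade2"] * 2
curve = rarefaction_curve(labels)

# Print the mean number of introductions observed at a few subsample sizes.
for k in (1, 5, 10, 15, 20):
    print(k, round(curve[k - 1], 2))
```

Note that a curve of this kind can only saturate at the number of clades already observed, so it addresses whether the existing sample is internally saturated and depends entirely on the random-sampling assumption stated above.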
While it may be tempting to read the inferred phylogeographic migration history as a complete record of transmission between countries, we caution against doing this for analyses of ZIKV. In contrast to other large outbreaks, such as the Ebola epidemic in West Africa, genomic sampling of the American ZIKV epidemic is sparse. Many ZIKV-affected countries have minimal genomic sampling; others have none at all. Thus while the phylogeographic model will correctly infer the geographic location of internal nodes given the dataset at hand, adding sequences from previously unsampled countries may alter migration histories such that apparent direct transmission from country A to country C instead becomes transmission from country A to country B to country C.
Consistent with other studies, our estimates of when introductions to Colombia occurred support cryptic ZIKV transmission prior to initial case confirmation. Perhaps more surprisingly, our estimate of the age of clade 1 indicates that ZIKV likely spread to Colombia even before official confirmation of ZIKV circulation in Brazil [37]. These findings underscore the utility of genomic epidemiology to date introduction events and describe transmission patterns that are difficult to detect using traditional surveillance methods, thereby providing more accurate definitions of the population at risk and a better understanding of how importation and within-country transmission shape epidemics.
Conclusions
We found evidence for two separate introductions of ZIKV to Colombia, one of which occurred five to eight months prior to the official confirmation of ZIKV in Colombia. Refining our estimates of when ZIKV circulated in Colombia improves our definition of when individuals were at risk for ZIKV infection. Accurately defining this exposure period increases the ability of population-level observational studies to properly measure associations between ZIKV infection and outcomes of interest. In addition, we found that the majority of ZIKV diversity in Colombia descends from one of these two introductions, and rarefaction analyses suggest that we would not identify more introductions with greater genomic sampling. Taken together, these findings suggest that most cases of ZIKV infection were attributable to ZIKV transmission within Colombia after a single introduction event, and that cases of ZIKV infection acquired in other countries and brought back to Colombia were rare. As the majority of Colombian ZIKV infections were locally acquired, infection prevention and control measures targeting local spread might have limited the scale of the outbreak within Colombia.
Acknowledgements
We thank R. Shabman, B. Pickett, P. Rahal, L. Karan, R. Delgado, A. Enfissi, N. Grubaugh, and R. Lanciotti for giving us permission to include unpublished genomes available on GenBank in our analysis. We also thank Maria Fernanda Torres for facilitating this collaboration and Adam Geballe for generously loaning space in his laboratory at the Fred Hutch.