A detailed comparative analysis on the overall codon usage patterns in West Nile virus

https://doi.org/10.1016/j.meegid.2013.01.001

Abstract

West Nile virus (WNV) is a member of the family Flaviviridae and its genome consists of an 11-kb single-stranded, positive-sense RNA. WNV is maintained in an enzootic cycle between mosquitoes and birds, but can also infect and cause disease in horses and humans, which serve as incidental dead-end hosts. Understanding the extent and causes of biases in codon usage is essential to the comprehension of viral evolution. In this study, we performed a comprehensive analysis of 449 WNV strains for which complete genome sequences are available. The effective number of codons (ENC) indicates that the overall codon usage among WNV strains is only slightly biased. Codon adaptation index (CAI) values found for WNV genes are different from the CAI values found for human genes. The relative synonymous codon usage patterns of WNV strains isolated from birds, equines, humans and mosquitoes are roughly similar and are influenced by the relative dinucleotide frequencies. Taken together, the results of this work suggest that WNV genomic biases are the result of the evolution of genome composition, the need to escape the antiviral cell responses and a dynamic process of mutation and selection to re-adapt its codon usage to different environments.

Highlights

• We performed a comprehensive codon usage analysis of 449 West Nile virus strains.
• The overall codon usage among West Nile virus strains is only slightly biased.
• Codon adaptation index values obtained for West Nile virus genes are different from the ones obtained for human genes.
• These differences seem to be due to codon preferences in their codon usage.
• Codon usage in West Nile virus is influenced by the relative dinucleotide frequencies.

Introduction

West Nile virus (WNV) is a member of the family Flaviviridae and belongs to the genus Flavivirus, which consists of more than 70 species. Among these are several arthropod-transmitted viruses with clinical importance, most prominently dengue virus (DENV), Yellow fever virus (YFV), Tick-borne encephalitis virus (TBEV) and Japanese encephalitis virus (JEV). Flaviviruses cause severe health problems in nearly all parts of the world. Within the flaviviruses, WNV is classified into the JEV serogroup, which also includes Murray Valley encephalitis virus (MVEV), St. Louis encephalitis virus (SLEV) and Usutu virus (USUV) (Calisher et al., 1989, May et al., 2011, Ulbert, 2011). Flaviviruses have an 11-kb single-stranded, positive-sense RNA genome which is translated into a single polyprotein upon infection of the host cell. This polyprotein is enzymatically processed by both viral and host cell proteases, yielding the three structural proteins (C, prM and E) and seven non-structural proteins (NS1, 2A, 2B, 3, 4A, 4B and 5) (Roosendaal et al., 2006). WNV is maintained in an enzootic cycle between mosquitoes and birds, but can also infect and cause disease in horses and humans, which serve as incidental dead-end hosts (Pesko and Ebel, 2012, Lim et al., 2011). WNV is endemic in some regions of Africa, Europe, the Middle East and Asia (Dauphin et al., 2004). As with other vector-borne diseases, the warmer temperature in the tropics facilitates longer transmission seasons and sometimes increased transmission intensity through faster mosquito and virus development and increased biting rates. Following its emergence in the United States in 1999, WNV has rapidly spread across North America, and has recently been reported in South America and the Caribbean (Komar and Clark, 2006). WNV is currently the most widely distributed of the encephalitic flaviviruses (May et al., 2011).

Compared to most mosquito-borne viruses, WNV has an enormous vector and host range. More than 300 avian species are susceptible and many of these develop high viral serum titers during the acute phase of infection, although members of the order Passeriformes (particularly Passer domesticus, Turdus migratorius, Sturnus vulgaris, Cyanocitta cristata and Carpodacus mexicanus) are presumed to be the most important avian hosts in both Europe and the Americas. On the other hand, most isolations from mosquitoes come from Culex species (particularly Culex pipiens and Culex pipiens quinquefasciatus) (Hamer et al., 2009, Komar et al., 2003).

The redundancy of the genetic code, in which most of the amino acids can be encoded by more than one codon, offers evolution the opportunity to tune the efficiency and accuracy of protein production to various levels while maintaining the same amino-acid sequence (Stoletzki and Eyre-Walker, 2007). The various codons that correspond to the same amino acid are often considered ‘synonymous,’ yet their corresponding tRNAs might differ in their abundance in cells and thus also in the speed with which they are recognized by the ribosome. The alternative nucleotide sequences of the various codon choices for a protein might give rise to transcripts with different secondary structure and stability, which may affect translation and even folding (Ikemura, 1982). The number of alternative nucleotide sequences that could still code for the same protein is astronomical, leaving many degrees of freedom that evolution could use for achieving control without affecting the protein sequence. While the non-random usage of synonymous codons is often assumed to reflect the action of neutral drift, in an increasing number of cases it now turns out to reflect the result of natural selection, perhaps mainly for tuning efficiency and accuracy of translation (Gingold and Pilpel, 2011).

The differential usage of synonymous codons (among other aspects of genome evolution) might be important for the comprehension of viral biology, in particular the interplay between viruses and the immune response (Shackelton et al., 2006). Indeed, as is well known, synonymous triplets are generally not used randomly, and the main forces that drive this bias away from equal usage are natural selection (which is related to translation efficiency at two different levels: speed and accuracy) and mutational biases. In spite of recent efforts to understand codon usage biases in viruses (Liu et al., 2011, Liu et al., 2012, Zhou et al., 2012, Pandit and Sinha, 2011, D’Andrea et al., 2011), more studies are needed in order to address the evolutionary forces that influence the observed patterns. Because of their comparatively small genome size and some other features generally associated with viruses (for example recurrent bottlenecks), they are very probably subject to different constraints from those acting on prokaryotic and eukaryotic genomes. Since their replication and protein synthesis depend on the host’s machinery, the interplay of codon usage in the virus and the host is expected to affect viral fitness, survival and evolution. For these reasons, a detailed understanding of the evolution of WNV codon usage in relation to the host is much needed.

In order to gain insight into these matters, we performed comprehensive analyses of codon usage and composition of 449 WNV strains, which represent all the complete genome sequences available in the databases, and investigated the possible key evolutionary determinants of the biases found.

Sequences

Complete genome sequences of 449 WNV strains were obtained from the DDBJ database (available at: http://arsa.ddbj.nig.ac.jp/). For strain names and accession numbers see Supplementary Material Table 1. The data set comprised a total of 1,541,388 codons.

Data analyses

Codon usage, dinucleotide frequencies, base composition and the relative synonymous codon usage (RSCU) (Sharp and Li, 1986) were calculated using the program CodonW (written by John Peden and available at http://sourceforge.net/projects/codonw/).
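As an illustration of the RSCU measure named above, the following is a minimal sketch and not the CodonW implementation. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally, so a value of 1 means no bias. The function name `rscu`, the compact codon table and the toy sequence are hypothetical choices made for this example; the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumed here; '*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group codons into synonymous families, one family per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stops, Met and Trp carry no synonymous choice
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count / (family total / family size)
            result[codon] = counts[codon] * len(family) / total
    return result


if __name__ == "__main__":
    # Toy sequence: four alanine codons (GCT twice, GCC once, GCA once).
    for codon, value in sorted(rscu("GCTGCTGCCGCA").items()):
        print(codon, round(value, 2))
```

Within one amino acid family the RSCU values sum to the number of synonymous codons, which gives a quick sanity check on any implementation.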

Codon usage variation among WNV strains

In order to investigate if these WNV strains display similar codon usage biases, the ENC’ values were calculated for the 449 strains enrolled in this study. A mean value of ENC of 53.81 ± 0.11 was obtained. This suggests that the overall codon usage among these strains is similar and only slightly biased.
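To put the reported mean of 53.81 in context, ENC ranges from 20 (one codon used per amino acid) to 61 (all synonymous codons used equally), so values in the low 50s sit near the unbiased end. The sketch below assumes Wright's original (1990) formulation of ENC; it may differ from the exact variant used for the values above, and the function name `effective_number_of_codons` is hypothetical. For simplicity it raises an error when a degeneracy class cannot be estimated instead of applying a correction.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (assumed here; '*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of amino acids in each degeneracy class of the standard code.
CLASS_SIZES = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(seq):
    """Wright-style ENC for one in-frame coding sequence (illustrative sketch)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    # Codon homozygosity F for each amino acid, grouped by degeneracy class.
    homozygosity = {k: [] for k in CLASS_SIZES}
    for aa, family in families.items():
        k = len(family)
        if aa == "*" or k == 1:
            continue
        n = sum(counts[c] for c in family)
        if n < 2:
            continue  # F is undefined for fewer than two observations
        sum_p2 = sum((counts[c] / n) ** 2 for c in family)
        homozygosity[k].append((n * sum_p2 - 1) / (n - 1))

    # Met and Trp contribute one codon each, hence the starting value of 2.
    nc = 2.0
    for k, size in CLASS_SIZES.items():
        values = homozygosity[k]
        mean_f = sum(values) / len(values) if values else 0.0
        if mean_f <= 0:
            raise ValueError(f"cannot estimate the {k}-fold degeneracy class")
        nc += size / mean_f
    return min(nc, 61.0)  # estimates above 61 are capped at 61
```

Short sequences give noisy estimates with this formula, which is one reason to compute it on complete coding regions as in the data set described here.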

In order to gain insight into these matters, a correspondence analysis (COA) was performed on the RSCU values for all WNV strains enrolled in this study. The first axis generated by the analysis accounts for 27.7% of

Conclusions

The results of this study revealed different codon preferences in WNV genes in relation to the codon usage of human genes (see Table 2). We show that codon usage bias in this virus is relatively low. This is in agreement with previous results found for other RNA viruses such as H5N1 Influenza A Virus (Ahn et al., 2006, Zhou et al., 2005); SARS (Zhao et al., 2008); FMDV (Zhong et al., 2007); classical swine fever virus (Tao et al., 2009); Duck Enteritis virus (Jia et al., 2009) or Theilovirus (Liu

Acknowledgements

We acknowledge support by International Atomic Energy Agency, through Research Contract no. 15792. Authors acknowledge support by Agencia Nacional de Investigación e Innovación (ANII) through project PE_ALI_2009_1_1603 and PEDECIBA, Uruguay. We also thank the support of Fondo Clemente Estable, 2007_ 722 to HM. We thank anonymous reviewers for important insights into the improvement of the quality of this work.

References (47)

  • S. Zhao

    Analysis of synonymous codon usage in 11 Human Bocavirus isolates

    Biosystems

    (2008)
  • T. Zhou

    Analysis of synonymous codon usage in H5N1 virus and other Influenza A viruses

    Biosystems

    (2005)
  • I. Ahn

    Genomic analysis of Influenza A viruses, including Avian Flu (H5N1) strains

    Eur. J. Epidemiol.

    (2006)
  • D.E. Brackney

    West Nile virus genetic diversity is maintained during transmission by Culex pipiens quinquefasciatus mosquitoes

    PLoS ONE

    (2011)
  • C.H. Calisher

    Antigenic relationships between flaviviruses as determined by cross neutralization tests with polyclonal antisera

    J. Gen. Virol.

    (1989)
  • A.T. Ciota

    Role of the mutant spectrum in adaptation and replication of West Nile virus

    J. Gen. Virol.

    (2007)
  • E.R. Deardorff et al.

    West Nile virus experimental evolution in vivo and the trade-off hypothesis

    PLoS Pathog.

    (2011)
  • E. Domingo

    Quasispecies and RNA virus evolution: principles and consequences

    (2001)
  • A. Dorn et al.

    Clinical application of CpG-, non-CpG, and antisense oligodeoxynucleotides as immunomodulators

    Curr. Opin. Mol. Ther.

    (2008)
  • H. Gingold et al.

    Determinants of translation efficiency and accuracy

    Mol. Syst. Biol.

    (2011)
  • M. Greenacre

    Theory and Applications of Correspondence Analysis

    (1984)
  • G.L. Hamer

    Host selection by Culex pipiens mosquitoes and West Nile Virus amplification

    Am. J. Trop. Med. Hyg.

    (2009)
  • T. Ikemura

    Correlation between the abundance of yeast transfer RNAs and the occurrence of the respective codons in protein genes. Differences in synonymous codon choice patterns of yeast and Escherichia coli with reference to the abundance of isoaccepting transfer RNAs

    J. Mol. Biol.

    (1982)
    1

    These authors contributed equally to this work.
