Norovirus (NoV) has been one of the leading causes of acute viral gastroenteritis worldwide over the last 20 years [1–3], infecting humans of all ages. Following the introduction of rotavirus vaccines (Rotarix® and RotaTeq®) into national immunization programs, NoV has become the leading cause of acute gastroenteritis in children in some countries [4–6]. The dominant strains belong to genogroup II, genotype 4 (GII.4); new GII.4 variants emerge every 2 or 3 years and are associated with pandemics or prevail in particular regions [7]. A newly emerged GII.4 variant was identified in March 2012 during an epidemic in Sydney. The variant spread quickly to North America [8], Europe [9], and other regions of the world [10]. For example, from September to December 2012 in the USA, 19–58 % of all cases of acute gastroenteritis were caused by the Sydney GII.4 variant [5]. In an epidemic in Huzhou, China, a Sydney GII.4-like variant was identified, and the sequence of its VP1 gene was determined and submitted to GenBank.

In April 2013, a new Sydney 2012-like GII.4 variant was isolated in Jingzhou, Hubei, China, from a 2-year-old boy hospitalized with acute gastroenteritis. GII.3 and GI.2 NoVs were isolated at the same time. The complete nucleotide sequences of the three strains were determined and analyzed. The experimental procedures were, briefly, as follows. Stool samples from inpatients and outpatients with acute gastroenteritis were collected and diluted with 0.01 M PBS, pH 7.2, to make a 20 % (w/v) suspension. The suspension was cleared by low-speed centrifugation, and 140 μl of supernatant was used for RNA extraction with the QIAamp Viral RNA Mini Kit following the manufacturer's instructions. Extracted RNA was used for first-strand cDNA synthesis with the PrimeScript 1st Strand cDNA Synthesis Kit. The synthesized cDNA was screened for the presence of the NoV genome with a detection kit for Norwalk-like virus (PCR-Fluorescence Probing method). Mixed degenerate primer sets were used to amplify the ORF1–ORF2 junction of the GI and GII genogroups from positive samples [11]. The PCR-amplified fragments were sequenced and then used in BLAST searches to find the complete genome sequences most similar to the Jingzhou isolates, which served as templates for the design of sequencing primers. Four primer sets for each of two genotypes were used to sequence the whole genomes of the isolated strains, with the exception of the first 24 nucleotides at the 5′-end of the viral genomes, which are derived from the primers.

The whole genome of the Jingzhou GII.4-2013 strain (Jingzhou 403) was cloned (Fig. 1) and sequenced (GenBank accession number KF306214), representing the first complete nucleotide sequence of a GII.4 Sydney 2012-like Chinese strain. The genome of GII.4 Jingzhou 2013 is 7,559 nucleotides (nt) in length with a polyadenylated tail. ORF1 (5,100 nt) and ORF2 (1,623 nt) overlap by 20 nt, and ORF2 and ORF3 (807 nt) overlap by 1 nt. In comparison with the original Sydney strain (GenBank accession number JX459908.1), there are 54, 24, and 9 nucleotide differences in ORF1, ORF2, and ORF3, causing 10, 7, and 4 amino acid (aa) changes, respectively (Table 1). The seven aa changes in ORF2 affect the capsid protein VP1, and four of them lie within the hypervariable P2 domain. Of these four changes in the P2 domain of the GII.4 Jingzhou 2013 strain, residues 373 and 377 are located adjacent to putative epitope C (consisting of aa residues 296 and 372) and epitope B (consisting of aa residues 340 and 376), respectively [12]. Such changes might enable new strains to evade human immune responses by altering antibody recognition sites and/or receptor binding sites, resulting in varied binding patterns to HBGA receptors [13]. Compared with the VP1 gene of the Huzhou 2013 strain, 14 nt differences were identified, resulting in three aa changes (Table 2). These changes may correlate with escape from herd immunity [14]. The new GII.4 variant co-circulated with GII.3 and GI.2 NoVs in Jingzhou. The complete nt sequences of representative Jingzhou GII.3 and GI.2 strains were also determined (GenBank accession numbers KF306213 and KF306212).
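The ORF lengths and overlaps reported above fix the relative layout of the three reading frames. The following minimal sketch works through that arithmetic. The function name `orf_layout` and the choice to number positions from the first nucleotide of ORF1 are illustrative assumptions; the absolute genome coordinates of ORF1 are not stated in the text.

```python
# Relative ORF coordinates from the reported lengths and overlaps.
# Assumption: positions are counted from the first nucleotide of ORF1
# (position 1), not from the 5' end of the genome.

def orf_layout(lengths, overlaps, first_start=1):
    """Return (start, end) pairs, 1-based inclusive, for consecutive ORFs.

    overlaps[i] is the number of nucleotides shared by ORF i and ORF i+1.
    """
    coords = []
    start = first_start
    for i, length in enumerate(lengths):
        end = start + length - 1
        coords.append((start, end))
        if i < len(overlaps):
            # The next ORF begins inside the last `overlaps[i]` nt of this one.
            start = end - overlaps[i] + 1
    return coords


GENOME_LENGTH = 7559                 # nt, as reported
ORF_LENGTHS = [5100, 1623, 807]      # ORF1, ORF2, ORF3
ORF_OVERLAPS = [20, 1]               # ORF1/ORF2 and ORF2/ORF3

coords = orf_layout(ORF_LENGTHS, ORF_OVERLAPS)
coding_span = coords[-1][1] - coords[0][0] + 1
outside_orfs = GENOME_LENGTH - coding_span

for name, (start, end) in zip(("ORF1", "ORF2", "ORF3"), coords):
    print(f"{name}: {start}-{end} relative to the ORF1 start")
print(f"Coding span: {coding_span} nt; outside the ORFs: {outside_orfs} nt")
```

Under these assumptions the three ORFs span 7,509 nt, which leaves 50 nt of the reported 7,559 nt outside the coding regions.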

Fig. 1
Cloning strategy. Four overlapping fragments covering the whole genome were cloned into the pGEM-T vector. Arrows indicate the orientation of primers used for cloning. The nt positions of individual fragments are shown below each primer

Table 1 Differences in nt and predicted aa of three ORFs between Sydney 2012 and Jingzhou 403 strains
Table 2 Differences in nt and predicted aa of ORF2 between Huzhou128 and Jingzhou403 strains
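Tallies of the kind summarized in Tables 1 and 2 can be reproduced by comparing two aligned coding sequences position by position and codon by codon. The sketch below is illustrative only and is not the authors' pipeline. It assumes two in-frame, gap-free sequences of equal length, and the function name `compare_orfs` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def compare_orfs(ref, query):
    """Compare two aligned, in-frame coding sequences of equal length.

    Returns (nt_diffs, aa_changes):
      nt_diffs   - 1-based nucleotide positions that differ
      aa_changes - (1-based codon number, reference aa, query aa) tuples
    """
    ref, query = ref.upper(), query.upper()
    if len(ref) != len(query) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned, equal-length and in frame")

    nt_diffs = [i + 1 for i, (a, b) in enumerate(zip(ref, query)) if a != b]

    aa_changes = []
    for i in range(0, len(ref), 3):
        ref_aa = CODON_TABLE[ref[i:i + 3]]
        query_aa = CODON_TABLE[query[i:i + 3]]
        if ref_aa != query_aa:
            aa_changes.append((i // 3 + 1, ref_aa, query_aa))
    return nt_diffs, aa_changes


# Toy example (not real NoV sequence): one synonymous and one
# non-synonymous substitution.
nt_diffs, aa_changes = compare_orfs("ATGGCTAAA", "ATGGCCAGA")
print(len(nt_diffs), "nt differences at positions", nt_diffs)
print(len(aa_changes), "aa change(s):", aa_changes)
```

In the toy example the GCT to GCC change is silent, whereas AAA to AGA replaces lysine with arginine. This is why the nucleotide counts in the tables exceed the amino acid counts.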

The phylogeny (Fig. 2) shows the evolution of the GII.4 lineages of major pandemic isolates [15]. To further clarify the evolution of the GII.4 lineages, 76 GII.4 VP1 sequences of NoVs spanning a 40-year period (1974–2013) were subjected to an evolutionary analysis (sequence accession numbers are available upon request). Rates of nucleotide substitution per site per year and the time to the most recent common ancestor were estimated using the Bayesian Markov chain Monte Carlo algorithm as implemented in the BEAST software package [16]. The rate of evolution, based on the most conservative molecular clock model (strict clock), was 4.80 × 10⁻³ nucleotide substitutions/site/year (4.33–5.28 × 10⁻³), and the mean age of the population was 40 years. Based on the above evolutionary analysis, a maximum clade credibility tree was inferred (Fig. 3), showing that the CHDC GII.4 sequences from the 1970s form a cluster that likely evolved into the present Sydney 2012 cluster.
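To put the estimated rate in concrete terms, it can be scaled by the ORF2 (VP1) length reported earlier, 1,623 nt. The following back-of-the-envelope sketch assumes a strict clock applied uniformly across all sites and ignores repeated substitutions at the same site. The variable names are illustrative, and the calculation is not part of the BEAST analysis itself.

```python
# Scale the per-site rate to the whole VP1 gene (ORF2, 1,623 nt).
# Assumptions: strict clock, uniform rate across sites, and no correction
# for multiple substitutions at the same site.

RATE_MEAN = 4.80e-3       # substitutions / site / year (reported estimate)
RATE_LOW = 4.33e-3        # lower bound of the reported interval
RATE_HIGH = 5.28e-3       # upper bound of the reported interval
ORF2_LENGTH = 1623        # nt, length of ORF2 as reported
YEARS = 40                # reported mean age of the population

per_gene_mean = RATE_MEAN * ORF2_LENGTH
per_gene_low = RATE_LOW * ORF2_LENGTH
per_gene_high = RATE_HIGH * ORF2_LENGTH
per_site_over_period = RATE_MEAN * YEARS

print(f"Expected substitutions per year across ORF2: {per_gene_mean:.2f} "
      f"({per_gene_low:.2f}-{per_gene_high:.2f})")
print(f"Expected substitutions per site over {YEARS} years: "
      f"{per_site_over_period:.3f}")
```

Under these simplifying assumptions the estimate corresponds to roughly eight substitutions per year across the VP1 gene.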

Fig. 2
Phylogenetic analysis of dominant GII.4 strains circulating during 1974–2013. The phylogenetic tree was constructed from the complete ORF2 sequences of NoVs. The evolutionary distances were computed using the Maximum Composite Likelihood method, and the tree was built using the neighbor-joining method. Numbers at each branch point are the bootstrap values supporting the corresponding cluster. The newly identified strain Jingzhou 403 is marked with a triangle. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. Strains and GenBank accession numbers are as follows: Huzhou128, KC473546.1; CHDC 1970s (1974–1977), FJ537134; Lordsdale (1987–1993), X86557; Camberwell (1987–1994), AF145896; US1995/96 (1995–2002), AY741811; Henry 2001 (2000–2004), EU310927; Japan 2001 (2002–2003), AB294779; Farmington Hills 2002 (2002–2004), AY502023; Asia 2003 (2003–2006), DQ369797; Hunter 2004 (2004–2006), DQ078814; 2006a (2006–2008), EF187497; 2006b (2006–2012), EF684915; Osaka 2007 (2007–2008), AB541319; Cairo 2007 (2007), GQ845368; Apeldoorn 2008 (2007–2011), HQ009513; New Orleans 2009 (2008–2012), GQ845367; Sydney 2012 (2012), JX459908

Fig. 3
Maximum clade credibility tree of the GII.4 NoV VP1 sequence dataset (76 sequences). Each internal node is labeled with the posterior probability of monophyly of the corresponding clade. GII.4 sequences from the CHDC1974 cluster and the Sydney 2012 cluster are indicated. Branch lengths were calibrated to reflect temporal patterns

These results help explain the rapid spread of this GII.4 variant to China and the rest of the world, as well as the rapid molecular evolution of NoVs, which is attributed to mutations arising in response to herd immunity. The impact of the amino acid changes in the VP1 of these variants on antigenicity and immunogenicity needs to be investigated, as multiple reports have concluded that the major capsid protein of GII.4 is evolving rapidly [17, 18]. Understanding the relationships among antigenic variations in the major capsid protein of pandemic strains should shed new light on the development of effective vaccines [19] and therapeutic monoclonal antibodies [20]. Considering the large population of China, and that NoVs have become the second leading pathogen of non-bacterial acute gastroenteritis [21, 22], the further spread and evolution of NoVs in China need to be closely monitored.