Background
Human papillomaviruses (HPVs) belong to family Papillomaviridae, with small, circular, double-stranded DNA genomes [
1]. Papillomaviruses are highly diverse and infect human epithelial cells, where some species are implicated in the development of several benign and malignant neoplasms [
1,
2]. Based on the nucleotide similarity in the L1 open reading frame (ORF), HPVs are classified into genera, species and types [
1,
3]. HPV types that display less than 60 % nucleotide sequence similarity in the L1 open reading frame (ORF) are classified into different genera, while viral species share 60–70 % nucleotide sequence similarity, and types show at least 90 % [
1]. The taxonomy of papillomaviruses at the species level and above is overseen by the International Committee for the Taxonomy of Viruses; while, HPV type designation and classification below the species level is maintained by the International HPV Reference Center at the Karolinska Institute, Stockholm, Sweden [
4]. As of May 6, 2016, two hundred and one different HPV types, ranging from HPV-1 to HPV-205, were officially recognized, including 65 Alphapapillomaviruses (Alpha-PVs), 51 Betapapillomaviruses (Beta-PVs), 81 Gammapapillomaviruses (Gamma-PVs), 3 Mupapillomaviruses (Mu-PVs), and a single Nupapillomavirus (Nu-PV) [
5]. Four previously recognized HPV types (HPV-46, HPV-55, HPV-64 and HPV-79) were recently re-classified as subtypes (HPV-79 was replaced by HPV91) [
6‐
8].
As more HPV types are discovered, it is necessary to confirm the genomic characteristics and the clinical relevance of these new HPV types. Although a variety of laboratory techniques for detecting known HPVs are available, an efficient laboratory method for the identification of divergent HPVs from different biological samples is required.
Viral metagenomics is a useful method that can non-specifically detect both already known and highly divergent viruses. Non capsid protected host-derived background DNA and RNA in biological samples can be digested by nuclease treatment and nucleotide sequence information from viral particles associated nucleic acids determined by sequence-independent amplification and deep sequencing to allows the identification of viruses recognizable through their nucleotide or translated protein sequence homologies to all known viruses. With the advent of viral metagenomics [
9‐
12] and the use of rolling circle amplification (RCA) which preferentially amplifies DNA circles [
13,
14], it becomes possible to obtain the complete genome of new HPVs at relatively high speed.
In the present study, a viral metagenomic method was applied to vaginal swab samples collected from pregnant women, and identified a novel gamma papillomavirus in a sample from a 25-year-old pregnant woman suffering from vaginitis. We describe here the genome and phylogenetic analysis of this novel Gamma-PV type. Type-specific PCR primers were also developed to test a representative collection of various clinical specimens.
Discussion
The advent of metagenomics using high-throughput sequencing technology [
9‐
12] facilitates the discovery of novel viruses at a relatively low cost and unprecedented speed, assists in the study of viral diversity, and the association of virus and disease [
28‐
30]. We report here the genome of HPV-ZJ01, a novel human gammapapillomavirus type identified from a vaginal swab sample of a pregnant woman, whose L1 gene shares 71.1 % nucleotide identity with that of HPV162, its closest relative. Phylogenetic analysis confirmed that HPV-ZJ01 clusters with other Gammapapillomaviruses in species 19 (HPV161, HPV162 and HPV166), all of them containing the typical HPV ORFs, coding for four putative early proteins (E6, E7, E1, E2), and two late capsid proteins (L2 and L1). Due to the presence of the characteristic donor (AAG/GUASNR) and acceptor (GUYACYAG/YU) RNA splicing sites in E1 and E2 ORFs, respectively [
13,
24‐
26], from HPV-ZJ01, HPV161, HPV162 and HPV166, it is most likely that the E4 proteins of these four HPVs are translated from a spliced mRNA, containing the first few codons of the E1 ORFs joined to the E4 sequence. The resulting putative E1^E4 fusion proteins contain a typically high content of proline, which represents 11 %, 13.27 %, 10 % and 10 % of E1^E4 amino acid sequences of HPV-ZJ01, HPV161, HPV162, and HPV166, respectively.
Analysis of genome organization of HPV-ZJ01, HPV161, HPV162 and HPV166 revealed that they share many similarities and some differences. Two conserved zinc-binding domains of CxxC(x)29CxxC separated by 36 amino acids are found at the same position away from the N-terminal end of E6 protein of these four HPVs, respectively. Additionally, the E7 proteins of them also contain a single zinc-binding domain of CxxC(x)29CxxC at the same position, respectively. The putative LXCXE motif found in the E7 protein of HPV-ZJ01, also identified in the three HPVs of Gamma genus, species 19 is a canonical pRB-binding motif and has been implicated in the transformation of host cells [
21,
31,
32]. A homologous LXCXE motif is used to bind to the pocket region of pRB, P107 and P130 and prevent inter-actions with the transcription factor E2F-1 by most HPV E7 proteins [
33]. The E1 of HPV-ZJ01 contains the conserved ATP-binding site of the ATP-dependent helicase (GPPDSGKS), nevertheless, in comparison, the motif is GPPDTGKS in all of the other three HPVs, with the Serine being replaced by Threonine. The putative E2 protein of HPV-ZJ01 contains no conserved leucine-zipper domain (L-X6-L-X6-L-X6-L), necessary for E2 dimerization. This domain is also missing in HPV161, 162 and 166. In the LCR of both HPV-ZJ01 and HPV161, a putative TATA box (TATAAA) of the E6 promoter is identified 36 and 34 nt upstream of the first start codon of the E6 ORF, respectively, while the classical TATA box (TATAAA) was not detected in the LCR of HPV162 and HPV166. Similarly, three and two E1 binding sites (AACAAT) are found in the LCR of HPV-ZJ01 and HPV161, respectively, while none is detected in the LCR of HPV162 and 166. As for the polyadenylation sites, the LCR of HPV-ZJ01 and HPV162 each contain one and HPV161 and HPV166 each contain two. Notably, for HPV161, its LCR simply contains two copies of the 12-basepair palindrome motif ACCG-N4-CGGT, known to play a crucial role in binding of the E2 protein [
18,
19]. However, in the other three HPVs (HPV-ZJ01, HPV162, HPV166), both palindrome sites (ACCG-N4-CGGT) and degenerate palindrome sites (ACC-N6/7-GGT) are simultaneously present in their LCRs. HPV-ZJ01 contains two palindrome sites (ACCG-N4-CGGT) and three degenerate palindrome sites (ACC-N6/7-GGT) in its LCR, while the LCR of HPV162 only contains one palindrome site (ACCG-N4-CGGT), and the other three E2 binding sites are represented by the degenerate ones (ACC-N6-GGT). Only one palindrome site (ACCG-N4-CGGT) and four degenerate palindrome sites (ACC-N6-GGT) are in the LCR of HPV166.
HPV types have traditionally been classified as mucosal or cutaneous on the assumption that HPV tissue tropism is reflected in the location where the genome of the particular type was found [
13,
34]. Accordingly, HPV types belonging to the Alpha-PVs are predominantly assigned as anogenital types and Beta-PVs, Gamma-PVs, Mu-PVs and Nu-PVs as cutaneous types [
2,
34]. However, a growing body of evidence suggests that HPV types belonging to Gamma-PVs (the genus whose known members has recently been expanding most rapidly) are ubiquitous and show broader tissue tropism than previously thought [
13], with reported detection sites ranging from healthy skin and various cutaneous lesions [
35‐
38] to genital lesions [
39] and oral [
34] and nasal [
40] mucosa. One recent report suggests that Gamma-PVs are also able to infect mucocutaneous sites such as the anal canal [
41] and even the possibility of dual tropism of Gamma HPVs was recently proposed [
26]. Here, HPV-ZJ01, confirmed to be a member of Gamma-PV genus, species 19 and identified in a vaginal swab can be preliminarily classified as mucosal. However, as its closest relatives (HPV161, HPV162 and HPV166) were isolated from healthy skin [
27], the tropism of HPV-ZJ01 remains to be definitely established as different PV types classified in a specific species typically share biological and pathological properties [
3]. The 25-year-old pregnant woman in whom we identified HPV-ZJ01 was suffering from vaginitis. Whether the presence of HPV-ZJ01 was at all related to this condition or indirectly facilitated viral replication is not known. No skin or mucosal lesion was noted by the attending gynecologist. In humans, cutaneous HPVs are highly prevalent in the general population, where they usually are contained by the immune system and result in subclinical infections. These infections however tend to become clinically apparent upon genetic, acquired, or iatrogenic deficiencies in cell-mediated immunity, and cutaneous HPVs have been linked to non-melanoma skin cancer in these patients [
42].
Here we screened total 216 samples including vaginal swabs, cervical carcinoma tissues and cervical smears from cervical carcinoma patients using PCR methods. The negative PCR results suggested that HPV-ZJ01 may not be associated with cervical cancer. Further studies with different sample types and a larger sample size are needed to clarify the prevalence and disease association, if any, for HPV-ZJ01.
Abbreviations
Alpha-PVs, Alphapapillomavirus; Beta-PVs, Betapapillomavirus; DNase, deoxyribonuclease; Gamma-PVs, Gammapapillomavirus; HPV, human papillomavirus; ICTV, International Committee on Taxonomy of Viruses; LCR, non-coding long control region; Mu-PVs, Mupapillomavirus; NCBI, National Center for Biotechnology Information; nt, nucleotide; Nu-PVs, Nupapillomavirus; ORF, open reading frame; PBS, phosphate-buffered saline; PCR, polymerase chain reaction; PV, papillomavirus; RCA, rolling circle amplification; RNase, Ribonuclease; URR, upstream regulatory region.