Background
Cervical cancer (CC) is the most frequent female cancer worldwide, occupying fourth place among cancers in women. In Mexico, the incidence of this pathology is high (13,960 new cases each year), reaching second place for cancers in women and causing 4769 deaths each year according to Globocan 2012 [
1]. Infection with HPV is the principal etiological factor involved in the development of CC [
2]. To date, >202 HPV genotypes have been characterized [
3] and >40 have been reported to infect the female genital tract [
4,
5]. Globally, the HPV genotypes mainly associated with CC include HPV-16, 18, and 45 [
6], but these data vary depending on the geographic area. For example, while HPV-35 and 45 are more prevalent in Africa, the genotypes 52 and 58 are more frequently observed in Asia, whereas 33 and 31 are more prevalent in Europe and the USA [
7‐
9]. Regarding Cervical intraepithelial neoplasia grade 1 (CIN I), HPV-42 and 84 were reported as more prevalent in Sweden [
10], while in Chile, HPV-61 and 89 [
11] are most prevalent, and in Asia, HPV-52 [
12] and in the USA (in rural American Indian women), HPV-61 [
13]. In Mexico, a recent study of our research group, using genotyping by Linear Array, show that the HPV genotypes most frequently found in cervical cancers were 16 (62.8 %), 18 (11.6 %), 45 (8.3 %), 52/58 (6.6 %), and 39 (5.8 %). Notably, a high percentage (58–64 %) of coinfections were found in CIN I—CIN III, suggesting that coinfections could contribute to regression or progression of the cervical lesions [
14].
All of these data suggest the need to develop detection and prevention strategies consistent with the characteristics of the geographic population.
There are few commercial methods available for HPV genotyping [
15], the majority of these are based on PCR and reverse hybridization. However, this approach entertains the drawback of only recognizing HPV genotypes with sufficient dissimilarity for discrimination by probes used in hybridization, allowing the detection of a restricted number of HPV genotypes and not discriminating among nucleotide variants. Next-Generation Sequencing (NGS), or Massive parallel sequencing (MPS), has emerged as a powerful tool for genotyping, possessing the advantage of analyzing many samples at the same time. Recently, this methodology has been used for the genotyping of HPV [
16‐
18]. In Mexico, there are few reports of HPV genotyping by means of LA and none, to our knowledge, employing NGS. In this work, we identified, by LA and NGS, the HPV genotypes infecting the cervixes of woman from Western Mexico.
Discussion
Cervical cancer remains a public health problem in Mexico, despite the prevention and vaccination campaigns implemented by the government. Vaccination is expected to decrease the incidence of this malignancy in the near future, and this probably would impact differentially, depending on the geographic region, because it has been shown that HPV genotype frequency varies among the different countries [
19‐
22]. Thus, there are local efforts to develop population-specific directed HPV vaccines; however, to develop a vaccine specific for a geographic population, it is essential to first know the HPV genotypes circulating in the population. In this study, we detected the HPV genotypes that are infecting women from Western Mexico, not only by LA, which is the commercial test with the major number of genotypes included, but also by NGS, to identify the widest possible genotypes. Results obtained by LA were very similar to those previously reported and discussed by our IMSS Research Network on HPV [
14]; however, the use of 454 NGS allowed us to identify previously not reported HPV genotypes in Mexican population. NGS is a technology that has been efficiently used for HPV genotyping; this methodology is based on Massive Sequencing (MS) of HPV fragments from a great number of genotypes, even when these are in low in copy number [
16,
18].
Using 454 NGS, we could detect genotypes not included in LA: 32, 44, 74, 102, and 114 (1, 79, 2, 2, and 30 reads, respectively). We found HPV-32 only in the WCL group; this genotype has been associated with oral focal epithelial hyperplasia [
23,
24] and oral papillomatous lesions from patients with Human immunodeficiency virus (HIV) infection [
25]. In another study, in a search for uncommon and rare HPV genotypes in cervical carcinomas, HPV-32 was not found [
26]. On the other hand, characterization of the properties of HPV-32 E6 and E7 oncoproteins have provided evidences of the benign nature of this genotype [
27].
With respect to HPV-44, we detected this genotype in WCL or CIN I samples, but not in CC. This HPV was isolated for the first time from a vulvar condyloma, and was found in five cases of 439 normal squamous epithelium samples and in three cases of 112 CIN I samples, but was not found in any of the 56 CC analyzed [
28]. However, this genotype has been found in patients with invasive cervical cancer in Africa with a frequency of 1.8 % of co-infection with other HPV genotypes [
29]. Additionally, it has been detected in patients with HIV with a frequency of 12.5 % in South Africa, Ghana, and Nigeria [
29], and has been also reported in anogenital carcinoma [
30]. In other studies, the prevalence of HPV-44 has been reported as ranging from 0.4–5 % [
31‐
34].
HPV-74 has been found in low-grade vaginal intraepithelial neoplasia in immunosuppressed women and has been related phylogenetically with HPV-6, 11, 44, and 55 [
35,
36]. We found this genotype only in some patients of the CIN I SI and CC MI groups. The number of reads obtained for HPV-74 was very low, suggesting a low prevalence. In this regard, low prevalence of HPV-74 has been detected by LiPA assay in genital warts in a German population [
34]; additionally, HPV-74 has also been reported as present in male populations from Brazil, Mexico, and the USA (by direct sequencing of products from nested PCR of the L1 region) [
37].
HPV-102 has been also found in this study only by NGS; we detected it in CIN I and CC, both in MI (Table
3). Despite that we found only one read in each group, we consider this genotype as present, because the sequences obtained are sufficient (337 and 411 bp) and identity was 99 and 100 %, respectively. Additionally, phylogenetic tree analysis places this genotype in
alpha species 3 (Fig.
3), as also reported [
36]. It is important to mention that we used all HPV reference sequences from the L1
alpha species for mapping. It has been reported that “simultaneous mapping” significantly decreases the read numbers obtained [
38], but we consider that it is better because more solid data are obtained. In agreement with our findings, this genotype has been detected in cervicovaginal cells from Hispanic females [
39].
Other additional genotype detected in our study, only by NGS, was HPV-114; this is a recently characterized papillomavirus isolated from a low-grade cervical lesion, and it has been found in low-grade samples with 1.7 % prevalence [
40]. We found this genotype in CIN I MI and CC, both SI and MI; its presence in CC suggest a possible oncogenic risk that needs to be evaluated individually in more samples. HPV-114 has also been detected by NGS in a Swedish cervical sample [
17], and has been found in one case of Vulvar intraepithelial neoplasia (VIN) [
41].
To complement and summarize our results, it was in our interest to performed a phylogenetic tree exclusively with sequences obtained by NGS from Mexican samples (Fig.
3); reconstruction shows a very similar pattern according to the classification of de Villiers [
36] despite using approximately 450 bp of the L1 region amplified by the PGMY09/11 set of primers. Bernard et al. in 1994 demonstrated that a short 291-bp fragment of the
L1 gene is sufficient to generate phylogenetic comparisons [
42].
It is accepted that in order to recognize a new Papillomavirus (PV) type, its complete genome must be cloned and its
L1 gene sequence must be different in 10 % from the closest type known; differences between 2 and 10 % represent a subtype, and differences <2 %, a variant [
43]. By analysis with the GS Reference Mapper, a variety of putative high-confidence nucleotide differences were determined, principally in HPV genotypes 31, 39, 45, 62, 66, 68, and 69. These nucleotide differences were found in two or more diagnosis groups (taking into account differences detected in at least 50 reads). Therefore, we continue to not discard the existence of HPV variants present in Mexico.
Regarding the classification of carcinogenicity for humans used by the International Agency for Research on Cancer (IARC) [
44], there are some HPV genotypes that are classified as “probably” or “possibly” carcinogenic or not classifiable. Among those, we found HPVs-11, 68, 70, and 69 only in CC samples. In contrast, HPVs-32, 44, 51, and 84 were observed in only the WCL and/or in the CIN I group. With regard to HPV genotypes not previously reported in Mexican population, HPV-74, 102, and 114 were detected in the CIN I and CC groups.
One limitation of this study was that groups of samples (according to the diagnosis), but not individually samples, were sequenced. Additionally, the HPV-genotypes detected were limited to the amplicons obtained by using the PGMY09/11 set of primers.
We continue to recruit and analyze additional cervical samples in order to corroborate and report these observations. Genotyping of HPV by NGS will permit the detection of more possible HPV variants, whose sequences could increase or diminish HPV recognition by the immune system or modify the oncogenic potential of HPV proteins [
45].
We continue to need more studies with methodologies such as NGS to determine the HPV genotypes and variants circulating in Mexican population and at the different stages during CC progression.
Methods
Sample collection
Approximately 250 cervical samples were obtained from the Centro Médico Nacional de Occidente (CMNO, IMSS) located in Guadalajara, Jalisco, Western Mexico. The women whose samples were taken attended a medical examination due to a cervical ailment. Diagnosis was first performed only by colposcopy observation (visual inspection using acetic acid and Lugol’s iodine solution); when cervical lesions/cancer were (was) observed, a biopsy was taken for diagnosis confirmation by histopathology. From the 250 samples obtained, 17 samples were excluded (those samples without sufficient quantity and quality of DNA), therefore, 233 samples were included in this study: 48 WCL; 98 CIN I, and 87 CC. Ages of the participating women ranged from 25–65 years. Samples were obtained by inserting a cytobrush into the endocervical canal, rotating it for 3–5 full turns, and placing it into ThinPrep PreservCyt® transport medium solution (Cat. no. 70097–002; Hologic, Inc., Marlborough, MA, USA).
HPV screening and Linear Array® HPV genotyping test
Total DNA was extracted and purified from each cervical sample using the AmpliLute Liquid Media Extraction Amplicor Kit (Cat. no. 03750540 190; Roche Molecular Systems, Inc., Branchburg, NJ, USA.). A first screening for HPV positivity was performed by conventional single-round PCR utilizing PGMY09/11 primers, which amplified a 450-bp fragment of L1 [
46]. HPV-positive samples were genotyped utilizing the Linear Array® HPV Genotyping test (Cat. no. 04391853 190; Roche Molecular Diagnostics, Pleasanton, CA, USA), which is capable of detecting simultaneously up to 37 HPV genotypes in a sample [
47]. This assay includes, as control, β-globin amplification in the same PCR reaction. According to the results obtained by LA, samples were subdivided into HPV-negative (−) or HPV-positive (+), and into Single- or Multiple infection (SI or MI, respectively).
HPV sequencing by 454 NGS
All HPV-positive samples identified by conventional PCR utilizing PGMY09/11 primers were included for MS. Amplicons for the NGS Library were generated by taking approximately 500–1000 ng of DNA from each sample and amplification by conventional PCR employing the following specifically designed primers:
Forward 5′-[Universal Multiplicom tail A: AAGACTCGGCAGCATCTCCA] – [individually PGMY11 primer]-3′, and
Reverse 5′-[Universal Multiplicom tail B: GCGATCGTCACTGTTCTCCA] – [individually PGMY09 primer]-3′. The PGMY sequences used were those reported in Table
2 in Gravitt et al. [
46]. Universal Multiplicom tails A and B were the same used in Cat. no. MR-0020.024, Multiplicom NV CFTR; Molecular Diagnostics, Niel, Belgium. PCR conditions included 1 cycle at 94 °C for 3 min, followed by 40 cycles at 94 °C for 45 s, 48 °C for 30 s, and 72 °C for 60 s, and a final cycle at 72 °C for 10 min. The PCR products (~450 pb) were electrophoresed in 1.5 % agarose gels, and cut and purified with the Wizard® SV Gel and PCR Clean-Up System (Cat. no. A9282; Promega, Madison, WI, USA).
Purified amplicons generated from each sample were quantified with the Qubit® dsDNA HS fluorometry assay Kit (Cat. no. Q32851; Life Technologies™, Eugene, OR, USA) to carry out 5-ng amplicon pools per sample. Amplicon pools were generated for the NGS Library, taking into account diagnosis, HPV negativity or positivity observed by LA, and the SI- or MI-HPV infection detected in a same sample. The following groups were generated: WCL (−), samples WCL-negative for HPV by LA; WCL (+), samples WCL-positive for HPV by LA; CIN I (−), CIN I samples negative for HPV by LA; CIN I SI, CIN I samples with single HPV infection; CIN I MI, CIN I samples infected with 2 or more HPVs, and CC SI and CC MI.
A second PCR reaction was performed to add the Multiplex identifiers (MID) barcodes, the key sequence (TCAG), and A or B sequencing adapters by using Multiplicom 454 MID Kits (Cat. no. ML-0008.192 and ML-0016.192, Multiplicom NV CFTR; Molecular Diagnostics) according to the manufacturer’s instructions. Amplicons where then purified with Agentcourt® AMPure® XP beads (Cat. no. A63881; Beckman Coulter Genomics, Danvers, MA, USA) and evaluated with the 2100 Bioanalyzer system (Cat. no. G2940CA; Agilent Technologies, Santa Clara, CA, USA).
The emulsion Polymerase chain reaction (emPCR) was performed with the GS Junior Titanium emPCR kit Lib A (Cat. no. 05996520001; Roche Diagnostics), and amplicons were sequenced in a GS Junior Sequencer utilizing the GS Junior Titanium Sequencing Kit (Cat. no. 05996554001) and the GS Junior Titanium PicoTiterPlate Kit (Cat. no. 0599661900), following the manufacturer’s instruction. All kits were supplied by Roche Diagnostics, Basel, Switzerland.
Data analysis
Quality control for sequencing data was performed, sequences with a Phred quality score of <20 were eliminated, and primer sequences were trimmed. The Roche GS Reference Mapper (ver. 2.9) was used to identify HPV genotypes by comparing fragment sequences with the L1 region of
alpha and
beta papillomavirus reference sequences, downloaded from Papillomavirus Episteme (PaVE) [
48]. Specific reads for each diagnosis group were identified in accordance with their associated MID barcode.
Analysis of phylogenic ancestry
Construction of a phylogenetic tree with consensus sequences of each HPV was performed. First, all consensus sequences were aligned utilizing MEGA6 software (employing the ClustalW ver. 1.6 algorithm), followed by the construction of the phylogenetic tree with the maximum likelihood method. Traditional tree-branch style was chosen and the Topology display was selected.
Competing interest
All of the authors declare no potential conflict of interest.
Authors’ contributions
MGF-M extracted DNA and performed PCR reactions. MGF-M and LA-R performed the LA test, and MGF-M, LAT-R, and VR-R conducted NGS. MGF-M, LAT-R, and SAR-M carried out the bioinformatics analyses; LMB-P was involved in patient interviews and sample recruitment. PP-S, VV-R, and EIC-G also participated in the general study design and critically revised the manuscript. AA-L and LFJ-S conceived of and designed the study, supervised all experiments and analyses, and wrote the manuscript. All authors read and approved the final manuscript.