Dependent variables
HPV persistence, understood as the detection of the same HPV genotype in two or more consecutive intervals.
Viral clearance, understood as the negative detection of the same genotype in a consecutive interval, following a positive sample (first negative PCR result after an incident infection).
SIL persistence: histopathological diagnosis of (low- or high-grade) SIL after a first SIL diagnosis. Persistence of the same lesion in two or more successive liquid-based cytology studies.
SIL regression: negative histopathological diagnosis for SIL after a first (low- or high-grade) SIL diagnosis. Regression of the lesion in two or more successive cytologic studies and subsequent healing.
Progression of (low- or high-grade) SIL: histopathological diagnosis of high-grade SIL or carcinoma in situ after a first diagnosis of SIL (low- and high-grade, respectively).
Main independent variable: levels at the cervix of the immune biomarkers IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα, expressed in pg/mL.
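The HPV persistence and clearance definitions above can be illustrated with a minimal sketch. It assumes that each "interval" corresponds to one follow-up visit and that results for a single genotype are stored in visit order; the function name and the data layout are hypothetical and not part of the protocol.

```python
def classify_genotype(results):
    """Classify visit-ordered PCR results for one HPV genotype.

    results: list of booleans, True = genotype detected at that visit.
    Returns (persistent, cleared):
      persistent: genotype detected at two consecutive visits
      cleared:    a negative result directly follows a positive one
    """
    pairs = list(zip(results, results[1:]))
    persistent = any(first and second for first, second in pairs)
    cleared = any(first and not second for first, second in pairs)
    return persistent, cleared


# Hypothetical follow-up of one woman for one genotype (not study data).
visits = [False, True, True, False]
print(classify_genotype(visits))  # (True, True)
```

Under these assumptions a genotype can be both persistent and later cleared within the same follow-up, so the two flags are reported separately.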
Analysis of cytokine expression and of HPV E6 and E7 oncoprotein expression at the cervical level
Total RNA isolated from cervical exudate will be used to synthesize cDNA; this will be performed with 200 U of M-MLV reverse transcriptase and 2.5 μg of total RNA under standard conditions. PCR will be carried out in a reaction volume of 25 μL containing 1 μL of cDNA, 0.2 mM dNTPs, 15 pmol of each oligonucleotide, 2.5 μL of reaction buffer, and 1 U of recombinant Taq DNA polymerase. The constitutive gene GAPDH (250 bp) will be used to verify cDNA integrity. PCR will be carried out in a Mastercycler gradient thermocycler (Eppendorf, Germany) under the following conditions: 5 min at 94 °C; 35 cycles of 1 min at 94 °C, 1 min at 60 °C, and 1 min at 72 °C; and a final extension step of 10 min at 72 °C. Amplification products will be resolved by electrophoresis in a 6% polyacrylamide gel and visualized under ultraviolet light after staining with ethidium bromide.
Expression probes to analyze the expression of the cytokines IL-10, IL-4, TGFβ1, IFNγ, IL-6, IL-2, and TNFα and the expression of the HPV E6 and E7 oncoproteins will be obtained from Applied Biosystems for real-time PCR analysis. The HPRT1 (hypoxanthine phosphoribosyl transferase) gene will be used to normalize the amount of mRNA in each sample to analyze IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα. The GAPDH gene will be used to normalize the amount of mRNA in each sample to analyze the expression of the HPV E6 and E7 oncoproteins.
Real-time PCR will be performed by adding 2 μL of each cDNA sample to a final reaction volume of 10 μL, containing 5 μL of Master Mix for expression, 0.4 μL of probe, and 2.6 μL of molecular-grade, DNase-free water. Amplification will be carried out in an Applied Biosystems ViiA 7 system (Foster City, CA, USA) under the following conditions: 10 min at 94 °C; 40 cycles of 1 min at 94 °C, 1 min at 54 °C, and 1.5 min at 72 °C; and a final extension step of 15 min at 72 °C. The level of mRNA expression for the genes under study will be calculated by relative quantification with the comparative Ct method (2^-ΔCt) and plotted as relative expression units of each gene with respect to the endogenous gene (HPRT1 or GAPDH) and to the comparison group. All samples will be analyzed in duplicate.
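As an illustration of the comparative Ct calculation, the following sketch computes 2^-ΔCt for one sample, with ΔCt taken as the mean Ct of the target gene minus the mean Ct of the endogenous gene. Averaging the duplicate wells before taking the difference is an assumption, and the Ct values are invented for the example.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return 2^-(dCt), where dCt = mean Ct(target) - mean Ct(reference)."""
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Hypothetical duplicate Ct readings for one sample (not study data).
il10_ct = [27.0, 28.0]   # target gene, e.g. IL-10
hprt1_ct = [24.0, 25.0]  # endogenous gene, e.g. HPRT1

print(relative_expression(il10_ct, hprt1_ct))  # 0.125, since dCt = 3.0
```

A ΔCt of 3 cycles therefore corresponds to a target transcript level of one eighth of the endogenous gene, under the usual assumption of roughly 100% amplification efficiency.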
Data collection and statistical analysis
A descriptive analysis of the sociodemographic and gynecological-obstetric characteristics, family history of cancer, and lifestyle-related variables in the study population will be performed. The questionnaire will include, among other variables: sociodemographic characteristics such as age, marital status, religion, education level, smoking habit (years, number of cigarettes per day, whether the habit is former or current, previous years of tobacco use), and socioeconomic level, for which an index (low, medium, and high tertiles) will be constructed by principal component analysis for the population included in the cohort, using information on household floor materials and availability of tap water, washing machine, refrigerator, television, radio, and stove; gynecological-obstetric traits such as the number of sexual partners, regular partners (defined as sexual activity with that person for at least 6 months), age at menarche, age at start of active sex life, parity, hormonal contraception (years of duration, current status), history of sexually transmitted diseases, condom use, genital hygiene, previous HPV infection, and previous local treatment for a cervical lesion; and family history of cancer, including type of cancer and consanguinity.
For continuous variables, expressed as mean ± standard deviation, the Kruskal-Wallis test will be used. For categorical variables, expressed as percentages, the chi-square test will be used. Results will be regarded as statistically significant at P < 0.05. All data will be analyzed with Stata v.14 for Windows. Missing data will be addressed by using maximum likelihood, multiple imputation, and inverse probability weighting, and analyzed via multilevel mixed-effects linear regression models.
To evaluate the levels of the immune biomarkers IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα and the expression levels of the proteins E6 and E7 in the cervix, the non-parametric Mann-Whitney U test will be used. A receiver operating characteristic (ROC) curve will be plotted to obtain the cut-off point of greatest discrimination with respect to the outcome variable (viral persistence, viral clearance, SIL incidence, SIL persistence, SIL progression). Once obtained, the sensitivity, specificity, positive predictive value, and negative predictive value for the cut-off points of each variable will be calculated. With respect to viral load, the Wilcoxon rank sum test will be used to measure the difference in median viral load for each HPV type, according to the infection status (HPV persistence or HPV clearance). All possible 2-way interactions between cervical levels of each immune marker and the viral load, and their association with HPV persistence, will be evaluated by adding multiplicative terms to the multivariate logistic models.
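The cut-off selection and the four diagnostic measures can be sketched as follows. The protocol does not name the criterion for the point of greatest discrimination; this example assumes Youden's J (sensitivity + specificity - 1) and assumes that higher biomarker levels indicate the outcome. The function names and the data are hypothetical.

```python
def diagnostic_metrics(values, outcomes, cutoff):
    """Sensitivity, specificity, PPV and NPV when value >= cutoff is 'positive'."""
    tp = fp = tn = fn = 0
    for value, outcome in zip(values, outcomes):
        predicted = value >= cutoff
        if predicted and outcome:
            tp += 1
        elif predicted:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }


def best_cutoff(values, outcomes):
    """Observed value that maximizes Youden's J (first one wins on ties)."""
    best_value, best_j = None, float("-inf")
    for cutoff in sorted(set(values)):
        m = diagnostic_metrics(values, outcomes, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:
            best_value, best_j = cutoff, j
    return best_value


# Hypothetical biomarker levels (pg/mL) and outcome flags (1 = persistence).
levels = [5, 8, 12, 15, 20, 22, 30, 35]
persistent = [0, 0, 0, 1, 1, 0, 1, 1]

cutoff = best_cutoff(levels, persistent)
print(cutoff)  # 15
print(diagnostic_metrics(levels, persistent, cutoff))
# {'sensitivity': 1.0, 'specificity': 0.75, 'ppv': 0.8, 'npv': 1.0}
```

Predictive values depend on the prevalence of the outcome in the sample, so values computed this way apply to the cohort in which they were estimated.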
The incidence and cumulative incidence rate of each outcome (HPV persistence, HPV clearance, SIL incidence, SIL persistence, SIL progression) will be determined, adjusting for changing levels of the immune biomarkers IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα and the expression levels of E6 and E7 in the cervix.
To assess the association of viral persistence or clearance, SIL incidence, regression, or progression with cervical levels of the immune biomarkers IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα, as well as with E6 and E7 expression levels and with cervical viral load, a Cox regression analysis will be performed, adjusting for the covariates age, number of sexual partners, age at menarche, age at start of active sex life, parity, smoking, hormonal contraception, history of other STIs, conservative clinical treatment, viral genotype, co-infection with two or more HPV genotypes, and duration of incident HPV infections. The viral load will be included in the analysis as the maximum viral load reached during an incident infection for each viral group. The endpoints for the analysis will be viral persistence or clearance and SIL incidence, regression, or progression.
The Kaplan-Meier method will be used to estimate the median duration of infection for most HPV types and for each previously defined viral group. Infections will be considered as persistent when their duration is longer than the median duration of the infection. A longitudinal approach will be applied to cluster all possible triplets of consecutive visits per individual, to compare the results of this measure of persistence with that obtained by using the traditional persistence definition (i.e. two consecutive positive samples).
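A minimal sketch of the Kaplan-Meier estimate of median infection duration, and of the duration-based persistence rule, is given below. It assumes that the event is viral clearance, that infections still positive at the last visit are censored, and that durations are measured in months; the data are invented for the example.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival)] at each time with at least one event.

    durations: follow-up time of each infection (e.g. months)
    observed:  True if clearance was seen, False if censored
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


def median_duration(curve):
    """First time at which the estimated survival is 0.5 or lower."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical infections for one HPV type (not study data).
months = [6, 6, 12, 12, 18, 24, 24, 30]
cleared = [True, True, True, False, True, True, False, True]

median = median_duration(kaplan_meier(months, cleared))
print(median)  # 18

# Duration-based rule: persistent if longer than the median duration.
print([d > median for d in months])
```

In this example the survival estimate is 0.75 at 6 months, 0.625 at 12 months, and 0.46875 at 18 months, so the median duration is 18 months and only the infections lasting 24 or 30 months are flagged as persistent.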
To evaluate the association of persistent infection with the risk of developing cervical intraepithelial neoplasia (CIN) grade 1 or grade 2/3, a Cox regression analysis will be performed, adjusting for cofactors relevant to the infection, such as smoking and co-infection with two or more HPV genotypes, along with the levels of the immune biomarkers IL-10, IL-4, TGFβ1, IFNγ, IL-6, and TNFα, the expression levels of E6 and E7, and the viral load in the cervix. The endpoints for the analysis will be the histopathological diagnosis of CIN 1, CIN 2, CIN 3, or carcinoma in situ.
According to the results of HPV genotyping, type-specific viral clearance rates in single and multiple infections will be compared with the stratified log-rank test. This test will also be used to compare the probability of viral clearance among HPV variants. The effect of co-infection by HPV genotype on incidence rates for CIN 1 and CIN 2/3 in women with single and multiple infections will be assessed by the Cochran-Mantel-Haenszel test stratified by age, the result of cytologic studies, and HPV type.