Background
Immunological correlates of protection are measurable and specific biological markers which correlate with protection against disease caused by an infectious pathogen. The markers used are most often pathogen-specific neutralizing antibodies whose concentration can be measured with biological assays [1]. Researchers and agencies responsible for immunization recommendations, such as the US Advisory Committee on Immunization Practices and the World Health Organization, rely on established threshold values for the immunological correlate of protection where the accepted threshold differentiates between individuals who are considered to be immunologically protected against disease and those who are susceptible [2, 3]. When a correlate is strongly associated with protection and has a recognized threshold, it can be called an absolute correlate [4].
Uses for the established threshold for a correlate of protection are numerous. For instance, where the correlate has been established for a vaccine that has already demonstrated clinical efficacy against disease, the correlate simplifies study of the vaccine in new populations, age- or risk-groups by permitting clinical trials to be conducted with immunogenicity endpoints and avoiding large-scale efficacy trials. The US Food and Drug Administration (FDA) offers accelerated approval when there is a correlate (FDA prefers the term “surrogate”) that is considered “reasonably likely” to predict clinical benefits [
5]. Other uses include the study of immunogenicity for co-administration with other vaccines, comparisons of combination vaccines to individual component vaccines and assessment of the duration of protection. The established correlate of protection also permits comparisons of new generation vaccines to older ones. For completely novel vaccines, the demonstration of a candidate immunologic correlate is becoming a secondary yet fundamental objective in clinical trials and epidemiological studies. This is encouraged by agencies such as the US FDA Center for Biologics Evaluation and Research and is one of the Grand Challenges in Global Health [
6]. Thus the accurate identification of protective threshold levels clearly has important implications for the licensure of vaccines and for immunization policy.
Research in correlates of protection is multidisciplinary. As a consequence, terminology used has been inconsistent and sometimes confusing. There have been recent efforts to harmonize the terminology employed and to link this to a hierarchy of statistical evidence for the demonstration of a correlate [
4,
7,
8]. In addition the terminology has been further refined by introducing the terms mechanistic and nonmechanistic to address whether the correlate of protection is causal or not [
9]. We will here for convenience use the term ‘correlate of protection’ in the broadest sense, to include immunological assays that have been consistently shown to correlate with risk of disease, assays that have been shown to be causally associated with protection, or specific threshold values of assays which have been accepted or proposed as differentiating susceptible from protected individuals. We also use the term ‘protective threshold’ to refer to an assay value for the correlate that distinguishes protected and unprotected individuals when the relationship between the correlate and protection can be reliably and usefully summarized with a single threshold value. However, individual variability means that at any threshold value some above will be susceptible and some below protected, and ‘protective threshold’ is not intended to imply any particular level of protection, and specifically is not intended to imply complete protection or ‘sterile immunity’. ‘Assay value’ and ‘titer’ are used interchangeably, according to context. A general opinion is emerging that improvement in statistical methods is needed [
10,
11] for identifying correlates of protection, but opinions vary on the appropriate statistical methodology. Methods and study designs have varied historically and across disease areas resulting in different standards of data quality and statistical methods to establish correlates of protection and their threshold values.
For older vaccines, the protective immunological thresholds have often been determined based on observational data, which was sometimes conveniently available or opportunistic. For example, Björkholm et al. measured diphtheria antitoxin titers in 44 individuals admitted to hospital during a diphtheria epidemic among alcoholics in Sweden and observed that 7 of 10 patients who had diphtheria antitoxin titers < 0.01 IU/ml died or showed neurological complications, whereas 33 out of 34 diphtheria carriers with antitoxin titers ≥ 0.16 IU/ml remained symptom-free [
12]. Further in vitro studies suggested that titers between 0.01 and 0.09 IU/ml may be regarded as giving basic immunity, whereas a higher titer of 0.1 IU/ml was considered fully protective [
13].
When an outbreak of measles occurred among students in a dormitory at Boston University, Chen et al. obtained permission to assay samples of blood donations made shortly before the start of the outbreak and compared their antibody concentrations with the occurrences of measles [
14]. Of 9 donors with detectable pre-exposure plaque reduction neutralization titer less than or equal to 120, 8 met the clinical criteria for measles compared with none of 71 with pre-exposure titers greater than 120. Similarly, Neumann collected sera from 238 high school students on Prince Edward Island before a measles epidemic sweeping the rest of Canada reached the island to compare infection rates by titer [
15].
An early study by Goldschneider et al. established a protective threshold for meningococcal C disease based on the serum bactericidal assay (SBA) [16]. American army recruits provided blood samples for assaying at the start of basic training, and disease occurred in only 1% of individuals who had SBA titers greater than 4 at recruitment compared to 22% of those who had titers less than 4. This was further confirmed by a population study that demonstrated an inverse relationship between disease incidence and the presence of SBA titers.
These early studies and others [
17] selected protective thresholds based on inspection of disease rates observed in discrete intervals of assay values with confidence limits never reported. Siber provides an in-depth discussion of this approach [
18] and introduces the idea of titer-specific degrees of protection.
For newer vaccines, clinical trials or observational studies specifically incorporate immunological data collection to identify potential thresholds, and statistical approaches have accordingly been developed for this purpose. For instance, in the Chang-Kohberger method, data from three double-blind controlled trials in Northern Californian, American Indian and South African infants were pooled in a meta-analysis to derive a protective threshold of 0.35 μg/ml for anticapsular antibodies for a 7-valent pneumococcal conjugate vaccine against invasive pneumococcal disease [19, 20]. The statistical method equates the relative risk of invasive pneumococcal disease between vaccine and control groups to the relative risk of having an antibody concentration below the protective threshold, and the protective threshold is then found from the cumulative distribution curves of the antibody concentrations of the vaccinated and control groups. The threshold has been endorsed by a WHO Working Group and has subsequently been used to develop and license a newer generation 13-valent vaccine [21].
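The equating step described above can be illustrated with a minimal sketch. It assumes the threshold is taken as the observed concentration at which the ratio of empirical proportions below the threshold (vaccinated over control) is closest to one minus vaccine efficacy. The function name, the grid of candidate values and the example concentrations are hypothetical and are not taken from the published method.

```python
import numpy as np

def chang_kohberger_threshold(vaccinated, control, vaccine_efficacy):
    """Sketch: find c where P(conc < c | vaccinated) / P(conc < c | control)
    is closest to the relative risk of disease, 1 - vaccine_efficacy."""
    vaccinated = np.asarray(vaccinated, dtype=float)
    control = np.asarray(control, dtype=float)
    target_rr = 1.0 - vaccine_efficacy
    best_c, best_gap = None, np.inf
    # Assumption: candidate thresholds are the pooled observed concentrations
    for c in np.unique(np.concatenate([vaccinated, control])):
        p_vax = np.mean(vaccinated < c)   # empirical CDF, vaccinated group
        p_ctl = np.mean(control < c)      # empirical CDF, control group
        if p_ctl == 0:
            continue                      # ratio undefined at this candidate
        gap = abs(p_vax / p_ctl - target_rr)
        if gap < best_gap:
            best_c, best_gap = float(c), gap
    return best_c

# Hypothetical antibody concentrations, for illustration only
control = [0.1, 0.2, 0.3, 0.4, 0.5]
vaccinated = [0.25, 1, 2, 3, 4, 5, 6, 7, 8, 9]
threshold = chang_kohberger_threshold(vaccinated, control, vaccine_efficacy=0.8)
```

Note that the vaccine efficacy estimate is an input here, which reflects why the method cannot be applied without a prior efficacy estimate based on a clinical endpoint.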
It was essentially this same method that was employed by Andrews et al. to derive a threshold for a correlate of protection following meningococcal C vaccination [
22]. The two modern examples for pneumococcal and meningococcal C vaccines that employed the Chang-Kohberger method, however, required an estimate of vaccine efficacy based on a clinical endpoint before the method could be used.
Few other statistical methods exist for identifying a threshold. The idea of estimating separate disease probabilities a and b below and above a threshold has been proposed by Siber et al., but no actual model was developed to estimate the threshold [20].
Other statistical approaches have focused on continuous models, which do not explicitly model a threshold. Logistic regression has frequently been used [
23‐
28]; other continuous models have included proportional hazards [
29] and Bayesian generalized linear models [
30]. Chan compared Weibull, log-normal, log-logistic and piecewise exponential models applied to varicella data [
31]. A limitation of such models is that they cannot separate exposure to disease from protection against disease given exposure, the latter being the relationship of interest. A scaled logit model which separates exposure and protection where protection is a continuous function of assay value has been proposed [
32]. The scaled logit model was illustrated with data from the German pertussis efficacy trial [
27] and has been used to describe the relationship between influenza assay titers and protection against influenza [
33‐
35]. However, these approaches do not explicitly allow identification of a single threshold value.
Thus despite the fundamental reliance on thresholds in vaccine science and immunization policy, previous statistical models have not specifically incorporated a threshold parameter for estimation or testing. In this paper, we propose a statistical approach based on the suggestion in Siber et al. [
20] for estimating and testing the threshold of an immunologic correlate by incorporating a threshold parameter, which is estimable by profile likelihood or least squares methods and can be tested based on a modified likelihood approach. The model does not require prior vaccination history to estimate the threshold and is therefore applicable to observational as well as randomized trial data. In addition to the threshold parameter the model contains two parameters for constant but different infection probabilities below and above the threshold and can be viewed as a step-shaped function where the step corresponds to the threshold. The model will be referred to as the a:b model.
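As an illustration of the step-shaped model described above, the following minimal sketch estimates the threshold by profile likelihood. It assumes binary disease outcomes, candidate thresholds restricted to the observed titers, and the convention that subjects with a titer equal to the threshold are counted as above it. These conventions and all names are assumptions made for the example, and the sketch does not implement the modified likelihood test.

```python
import numpy as np

def bernoulli_loglik(y, p):
    """Log-likelihood of binary outcomes y under constant probability p."""
    k, n = y.sum(), y.size
    ll = 0.0
    if k > 0:
        ll += k * np.log(p)
    if n - k > 0:
        ll += (n - k) * np.log(1.0 - p)
    return ll

def ab_profile_likelihood(titers, diseased):
    """Return (tau, a, b, loglik) maximizing the a:b log-likelihood.

    For each candidate tau, a and b are set to their maximum likelihood
    values (the observed disease proportions below and at/above tau).
    """
    t = np.asarray(titers, dtype=float)
    y = np.asarray(diseased, dtype=float)
    best = None
    # Skip the smallest titer so that both groups are non-empty
    for tau in np.unique(t)[1:]:
        below, above = y[t < tau], y[t >= tau]
        a, b = below.mean(), above.mean()
        ll = bernoulli_loglik(below, a) + bernoulli_loglik(above, b)
        if best is None or ll > best[3]:
            best = (float(tau), float(a), float(b), float(ll))
    return best

# Hypothetical data: titers and disease indicators (1 = diseased)
titers = [1, 1, 2, 2, 4, 4, 8, 8, 16, 16]
diseased = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tau_hat, a_hat, b_hat, loglik = ab_profile_likelihood(titers, diseased)
```

The sketch does not enforce that the estimated probability above the threshold is lower than that below it; in practice this would need to be checked before interpreting the estimate as a protective threshold.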
Discussion
Despite the central importance of threshold values in vaccines research and immunization policy, only the Chang-Kohberger method [19, 20] has been previously proposed to estimate thresholds from assay values and disease occurrence data, but its estimation requires information on vaccinated and unvaccinated groups. The a:b method provides a reliable, readily applicable method for finding a threshold for paired data of the form {y_i, t_i}, for which previous models and associated statistical testing were limited. The a:b model provides the same estimate as the maximal chi-square method [35] when least squares estimation is used.
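The agreement between least squares estimation and the maximal chi-square method can be checked numerically. For a binary outcome split into two groups, the Pearson chi-square statistic (without continuity correction) equals n(1 − RSS/SST), where RSS is the residual sum of squares about the two group means and SST is the total sum of squares, so minimizing RSS and maximizing chi-square select the same split. The sketch below assumes candidate thresholds at the observed titers; function names and data are hypothetical.

```python
import numpy as np

def candidate_splits(titers, diseased):
    """Yield (tau, outcomes below tau, outcomes at or above tau)."""
    t = np.asarray(titers, dtype=float)
    y = np.asarray(diseased, dtype=float)
    for tau in np.unique(t)[1:]:          # both groups stay non-empty
        yield float(tau), y[t < tau], y[t >= tau]

def rss(below, above):
    """Residual sum of squares about the two group means."""
    return (((below - below.mean()) ** 2).sum()
            + ((above - above.mean()) ** 2).sum())

def chi_square(below, above):
    """Pearson chi-square for the 2x2 table (group by disease status)."""
    n11, n12 = below.sum(), below.size - below.sum()
    n21, n22 = above.sum(), above.size - above.sum()
    n = below.size + above.size
    denom = (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    return 0.0 if denom == 0 else n * (n11 * n22 - n12 * n21) ** 2 / denom

def least_squares_threshold(titers, diseased):
    return min(candidate_splits(titers, diseased),
               key=lambda s: rss(s[1], s[2]))[0]

def max_chi_square_threshold(titers, diseased):
    return max(candidate_splits(titers, diseased),
               key=lambda s: chi_square(s[1], s[2]))[0]

# Hypothetical data: titers and disease indicators (1 = diseased)
titers = [1, 1, 2, 2, 4, 4, 8, 8, 16, 16]
diseased = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tau_ls = least_squares_threshold(titers, diseased)
tau_chi = max_chi_square_threshold(titers, diseased)
```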
The statistical criteria available for the evaluation of a threshold estimated by the a:b model are confidence interval width and location, goodness of fit, significance testing and relative risk. A number of factors are likely to influence the width of confidence intervals, including the presence of a clear, high step in the data and the number of subjects and cases of disease in the dataset. Further, bootstrap confidence intervals based on the candidate values of tau are affected by the density of distinct observed assay values in the region of the threshold. This is a data limitation arising from the assay technique which generates discrete rather than continuous titer values, with lower densities (fewer distinct assay values) tending to produce wider confidence intervals and higher densities allowing the possibility of smaller confidence intervals. The location of threshold point estimates and upper and lower confidence limits in some datasets suggested that profile likelihood estimates may be higher and therefore more conservative, requiring higher antibody titers to be achieved to conclude protection, compared to least squares estimates.
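The dependence of bootstrap confidence limits on the set of distinct assay values can be seen in a minimal percentile-bootstrap sketch. It assumes subjects are resampled with replacement, the least squares estimate is recomputed on each resample, and the confidence limits are taken as order statistics of the resampled estimates, so that both limits necessarily fall on observed titer values. The resampling scheme, function names and data are assumptions for illustration and may differ from the procedure used for the reported intervals.

```python
import numpy as np

def ls_threshold(t, y):
    """Least squares a:b threshold; None if only one distinct titer."""
    candidates = np.unique(t)[1:]
    if candidates.size == 0:
        return None

    def rss(tau):
        below, above = y[t < tau], y[t >= tau]
        return (((below - below.mean()) ** 2).sum()
                + ((above - above.mean()) ** 2).sum())

    return float(min(candidates, key=rss))

def bootstrap_threshold_ci(titers, diseased, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the threshold (a sketch)."""
    t = np.asarray(titers, dtype=float)
    y = np.asarray(diseased, dtype=float)
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, t.size, size=t.size)   # resample subjects
        tau = ls_threshold(t[idx], y[idx])
        if tau is not None:                          # skip degenerate resamples
            estimates.append(tau)
    estimates = np.sort(estimates)
    # Order statistics rather than interpolated quantiles, so the limits
    # are always values that occur among the observed titers
    lo = estimates[int(np.floor(alpha / 2 * estimates.size))]
    hi = estimates[int(np.ceil((1 - alpha / 2) * estimates.size)) - 1]
    return float(lo), float(hi)

# Hypothetical data: titers and disease indicators (1 = diseased)
titers = [1, 1, 2, 2, 4, 4, 8, 8, 16, 16]
diseased = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
lower, upper = bootstrap_threshold_ci(titers, diseased)
```

With only five distinct titers in this example, the interval can only move in steps between those values, which illustrates why sparse assay values near the threshold tend to produce wider intervals.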
The goodness-of-fit p-value was in some instances clearly consistent with the bar plots of the binned data, while in other cases it was less so, possibly due to discreteness in the data resulting from small numbers of cases of disease. Visual inspection of graphical representations of the data might routinely supplement statistical assessments.
Because the estimated threshold itself does not imply the degree of protection, relative risk aids in its interpretation. If a threshold is to separate susceptible from protected individuals, relative risk may be seen as a measure of the degree of protection and can be employed as one of the criteria for assessing the relevance of an estimated threshold in addition to the p-value from the test for significance. For example, the Swedish pertussis FHA IgG result produced a p-value of 3.5 × 10⁻⁴ but a relative risk of 0.508, implying around 50% reduction in risk, which may call into question the acceptability of the threshold, as higher protection is generally expected in vaccine-preventable diseases.
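The arithmetic behind this interpretation is sketched below, assuming the relative risk is defined as the estimated probability of disease at or above the threshold divided by that below it (b/a), so that 1 − RR is the proportional reduction in risk. This definition, the function name and the data are assumptions for illustration; the sketch also assumes a is non-zero.

```python
import numpy as np

def threshold_relative_risk(titers, diseased, tau):
    """Return (a, b, RR) for a given threshold tau, with RR = b / a."""
    t = np.asarray(titers, dtype=float)
    y = np.asarray(diseased, dtype=float)
    a = y[t < tau].mean()     # disease probability below the threshold
    b = y[t >= tau].mean()    # disease probability at or above the threshold
    return float(a), float(b), float(b / a)

# Hypothetical data: titers and disease indicators (1 = diseased)
titers = [1, 1, 2, 2, 4, 4, 8, 8, 16, 16]
diseased = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
a, b, rr = threshold_relative_risk(titers, diseased, tau=4)
risk_reduction = 1.0 - rr

# The relative risk of 0.508 quoted in the text corresponds to a
# reduction in risk of 1 - 0.508 = 0.492, i.e. around 50%
reported_reduction = 1.0 - 0.508
```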
Ideally, all assessment criteria would provide consistent results in support of a threshold. However, instances were noted where other conclusions might be warranted even though some statistical assessments were promising. For example, for the White/varicella data, the confidence interval for the threshold is narrow, the p-value for the threshold is highly significant and the relative risk is acceptable (close to 0.1), but the goodness-of-fit is poor (p = 0.085). It was found that this data is better fitted by a continuous scaled-logit model (p for goodness-of-fit = 0.999), suggesting that a relative rather than absolute threshold may be appropriate.
The threshold in the a:b model is the titer value that best separates the sample of patients into two groups with different but constant infection rates, but this does not require the ‘protected’ group to have a specified low probability of infection. It is therefore possible that the protected group defined by the estimated threshold has a high probability of infection, such as 20% in the pertussis PT IgG example, which could be deemed unacceptably high if one’s definition of a threshold requires low risk of infection. Therefore, an additional criterion that sets a maximum acceptable probability of infection amongst the protected group could be considered in addition to statistical tests when evaluating thresholds.
Although definitions of thresholds may differ, it is encouraging to note that others’ published estimates of thresholds for these same datasets are not dissimilar to estimates from the a:b model, suggesting consistency with others’ notion of an acceptable threshold. For instance, a previous analysis of the White/varicella data identified a gp ELISA titer of 5 U/mL to indicate protection, which is now reported to be an ‘approximate correlate of protection’ for varicella vaccines [39]. The estimate was consistent with our profile likelihood estimate of the threshold of 5.011 (95% CI: 2.584, 5.011). For the Swedish pertussis data, a putative threshold value of 5 units/mL for PRN, FIM and PT was found to be associated with high protection [28]; subjects having all three had even higher protection. However, while the authors applied the same putative threshold to all three pertussis components, we estimated different values for each: 5.477 (95% CI: 1.414, 15.49) for PT, 5.950 (95% CI: 2.298, 15.92) for PRN and 7.650 (95% CI: 1.249, 7.846) for FIM. For the German pertussis data, a regression tree approach found that a threshold value of 7 units/mL for PRN IgG was most predictive of protection [23]. We estimated a threshold of 13.165 (95% CI: 1.375, 29.31) with profile likelihood and 7.665 (95% CI: 0.855, 13.17) using least squares. Amongst the subset of subjects achieving 7 units/mL for PRN, those who had 66 units/mL of PT IgG had even greater protection. Our estimated threshold for PT IgG using profile likelihood was 1.385 (95% CI: 0.965, 1.390), but this figure is not comparable to the previous figure of 66 units/mL, which should be interpreted as a conditional threshold given that protective PRN levels are achieved.
Because the a:b model assumes constant rates of infection on each side of the threshold, which may be a strong assumption, we considered in supplementary analyses more flexible models which allowed linear, quadratic or logistic relationships on either side of the threshold. However, these models did not produce fits corresponding with the expectations of a correlate of protection. For instance, a step-down of infection rate at the threshold value and non-increasing rates of infection on either side of the threshold were not always observed. The a:b model was always consistent with these expectations. In addition, visual examination of the profile likelihood for these other models did not show sharp peaks corresponding to the optimal threshold value, and the models were associated with wider confidence intervals, resulting in greater uncertainty about the threshold value. In general, these more flexible models could not be relied upon to consistently find a threshold which could be said to differentiate protected from susceptible individuals.
The a:b model presented here does not require vaccination information to estimate a threshold. While this is an advantage, it is also a weakness given that the a:b model can provide only the first level of information in the hierarchy of evidence to demonstrate a statistical correlate of vaccine efficacy in the framework described by Qin et al. [
7]. To provide a higher level of evidence, the a:b model could be developed to include a vaccination parameter and an associated test. Also, further development could allow for multiple co-correlates in which two or three threshold values are estimated simultaneously. This could have application to diseases like pertussis where more than one antigen is necessary for the fullest protection or for new vaccines that protect against multiple serotypes of a disease, such as pneumococcal infection or dengue. Further research might also compare different statistical models for correlates of protection – the a:b model, the method of Chang and Kohberger [
19‐
21], the scaled logit model [
32‐
35], a linear trend model and logistic regression – and the conclusions reached by each for levels of protection.
In order to investigate correlates of protection and thresholds, there are also clinical and immunological considerations. A correlate must include a clearly defined clinical endpoint, whether protection is afforded against infection, disease, severe disease, infectiousness, carriage or other condition. For instance, it is thought that protection against pneumococcal infection requires progressively lower thresholds for protection against pneumococcal carriage, otitis media, pneumonia and invasive pneumococcal infection [
40]. Similarly, standardized laboratory assays and tests for disease case confirmation are also needed but not always feasible, which can potentially introduce bias in laboratory confirmed disease cases in some studies. An assay must first be selected by immunologists and validated according to immunological criteria – sensitivity, specificity, reliability, and freedom from inter-technician variability. It may be of interest to know whether the specific immune response measured by the assay is responsible for protection; statistical methods for causal inference have recently been developed allowing an assay to be selected which has been shown to be causally associated with protection [
41,
42]. Other considerations include: host factors in which the immune system changes throughout life implying different immune response by age, temporal immunological factors such as timing of measurement and kinetics of the immune response, and population factors given that observed thresholds may not be universally applicable to all settings. Thus, once a correlate of protection or threshold is proposed, further discussions with stakeholders are necessary to cover these disease-specific considerations that the statistical methods alone cannot address.
A final practical requirement is the availability of datasets suitable for identifying immunological correlates of protection. Vaccine efficacy trials provide a clear opportunity to collect data on the relationship between assay values for candidate correlates of protection and disease occurrence; however, they are often sized inadequately to yield convincing conclusions on correlates of protection. Typically, trials are designed to capture 40–100 cases of disease to convincingly demonstrate adequate vaccine efficacy against placebo [43‐45], but such trials are generally underpowered for assessing correlates of protection. Incorporating a correlate of protection objective in a clinical trial can add substantial expense, as it requires additional bleeds in subjects after they receive vaccine or placebo, so that their assay values are observed before any significant number of disease cases occur. Furthermore, more refined titer measures (i.e., less discrete data) would require more serial dilutions and greater blood volumes.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All authors contributed to the formulation of the research question, made methodological suggestions for consideration and evaluation by the group, and contributed to the interpretation of the results. XC, FB and AD performed the statistical calculations and KD and AD drafted the manuscript. All authors read and approved the final version.