Introduction
Treatment with combination antiretroviral therapies provides durable viral suppression among people living with human immunodeficiency virus (PLHIV) type 1 [1]. However, lifelong daily oral therapy can be burdensome, which can impact adherence and increase the risk of treatment failure [2]. A two-drug combination regimen comprising intramuscular (IM) injections of cabotegravir (CAB) and rilpivirine (RPV), administered every month or every 2 months, is approved for the treatment of HIV-1 infection in adults [3]. This may offer improved acceptability, adherence, and treatment satisfaction compared with the current standard of care with daily administration of an oral regimen (current antiretroviral therapy) [4]. PLHIV experience with HIV treatment regimens in terms of treatment satisfaction, acceptance, and health-related quality of life (HRQoL) is an important consideration given the need for lifelong treatment [4, 5]. Currently, the available patient-reported outcome (PRO) tools capture HRQoL [HIV/acquired immune deficiency syndrome (AIDS)-Targeted Quality of Life (HAT-QoL)] and treatment satisfaction [HIV Treatment Satisfaction Questionnaire (HIV-TSQ)] among PLHIV; however, PROs that specifically measure acceptance and experience with injectables are lacking for this population. Such PRO tools will be of increasing importance as HIV treatment modalities continue to change.
To fill this gap, the Perception of Injection (PIN) questionnaire was derived from the Vaccinees’ Perception of Injection (VAPI) questionnaire [6] and adapted for gluteal IM injection [7] in PLHIV. The PIN questionnaire was developed to provide crucial information on experience with injectable therapies, including acceptance of pain, injection-site reactions (ISRs), and tolerability following injections in PLHIV. As a next step, it is important to evaluate the PIN’s psychometric properties in a population of PLHIV receiving long-acting IM antiretroviral therapy.
This post hoc analysis aimed to evaluate the psychometric properties of the PIN questionnaire in PLHIV using data from two phase III studies [First Long-Acting Injectable Regimen (FLAIR; NCT02938520) and Antiretroviral Therapy as Long Acting Suppression (ATLAS; NCT02951052)], both of which have previously demonstrated the non-inferiority of a long-acting IM CAB plus RPV combination every 4 weeks compared with a standard of care, daily, oral three-drug regimen in PLHIV [8, 9]. As part of both studies, participants completed multiple questionnaires, including the PIN questionnaire, to assess their acceptability, tolerability, acceptance of pain, and ISRs following injections every 4 weeks [7].
Methods
Study Design and Participants
This was a post hoc statistical analysis conducted using data from participants who received the monthly long-acting injectable combination treatment (CAB + RPV) as part of the FLAIR and ATLAS studies. The study design and eligibility criteria of the FLAIR and ATLAS studies have been previously described [8, 9]. Briefly, FLAIR was a randomized (1:1), multicenter, open-label, non-inferiority study, wherein PLHIV could either continue their current daily oral antiretroviral therapy or switch to long-acting CAB + RPV therapy administered every 4 weeks for 100 weeks. PLHIV were enrolled if they had not previously received antiretroviral therapy and if they had achieved virologic suppression with daily oral antiretroviral therapy during the induction phase of the study [8]. ATLAS was a randomized (1:1), multicenter, parallel-group, open-label study, wherein participants could either continue their current daily oral antiretroviral therapy or switch to long-acting CAB + RPV therapy administered IM into the gluteal muscle every 4 weeks for 52 weeks. Enrolled participants were PLHIV who had been receiving antiretroviral drugs in an uninterrupted regimen without virologic failure and without a change in medication for at least 6 months prior to screening [9].
PRO Measures
As part of the two studies, PLHIV completed several PRO questionnaires at various time points, as described in the Supplementary Material (Fig. S1). The PIN questionnaire was completed at weeks 5, 41, and 48. Respondents were asked to consider the prior week (post-injection) at weeks 5 and 41, and the prior 4 weeks at week 48. Because the PIN questionnaire is injection-specific, and participants had no experience with the injection under investigation at study entry, the questionnaire was not administered at baseline. Pooled data from both the ATLAS and FLAIR trials were used for this post hoc psychometric validation of the PIN questionnaire.
The PIN questionnaire included items that assess participant experience with injections. The PIN measure was derived from the VAPI questionnaire and adapted for PLHIV receiving long-acting CAB + RPV. VAPI was chosen as the model as it had previously been validated and found to be a reliable tool for the assessment of vaccine injections [6]. Additionally, VAPI was designed to evaluate aspects of the participant experience similar to those intended to be explored in PLHIV, meaning that only minor changes were required in the development of the PIN questionnaire. These include replacement of the term ‘vaccination’ with ‘injection’ throughout the questionnaire, a change in item 3 from ‘pain in your arm’ to ‘pain in your buttock’, and changes in items 11 and 15 where ‘lifting your arm’ is replaced by ‘walking’. The PIN questionnaire contains 21 items, grouped into four multi-item domains: ‘Acceptance of ISR’ scale score (2 items); ‘Bother from ISR’ scale score (6 items); ‘Leg movement’ scale score (4 items); and ‘Sleep’ scale score (4 items); and five individual items: pain during injection; anxiety before injection; anxiety after injection; willingness to be injected in the future; and overall satisfaction with mode of administration. Participant responses were scored on a 5-point Likert scale with 1 representing the least favorable perception of injection and 5 the most favorable; domain scores were calculated as a mean of all items in the domain. Other PRO instruments used in analyses are briefly described below.
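The domain scoring rule described above (mean of all items in a domain, each item scored 1 to 5) can be illustrated with a minimal Python sketch. The item keys and the handling of missing items (returning no score when any item is missing) are assumptions made for illustration; they are not the published PIN scoring key, and the original analysis was run in SAS.

```python
from statistics import mean
from typing import Mapping, Optional, Sequence


def domain_score(
    responses: Mapping[str, Optional[int]],
    items: Sequence[str],
) -> Optional[float]:
    """Mean of the items in one domain, each scored 1 (least favorable) to 5.

    Assumption for this sketch: if any item in the domain is missing,
    no domain score is produced (no imputation).
    """
    values = [responses.get(item) for item in items]
    if any(v is None for v in values):
        return None
    if any(not 1 <= v <= 5 for v in values):
        raise ValueError("PIN item responses must lie between 1 and 5")
    return mean(values)


# Hypothetical item keys for a two-item domain.
example = {"isr_accept_1": 4, "isr_accept_2": 5}
score = domain_score(example, ["isr_accept_1", "isr_accept_2"])
```

Returning `None` instead of averaging the available items keeps the sketch aligned with the statement that no imputation was performed; a different missing-data rule would need to be stated explicitly.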
The 12-item Short Form Health Survey (SF-12) is a measure derived from the Medical Outcomes Study 36-Item Short Form Health Survey, containing the same eight domains that assess general health status and mental health distress, with responses collected on a graded scale or as Yes/No answers [10].
Overall function and well-being were evaluated with the shorter 14-item, 3-dimension version of the HAT-QoL questionnaire, which covers life satisfaction, feelings about medication, and disclosure worries, with participant responses collected on a 5-point scale ranging from ‘All of the time’ to ‘None of the time’ [11].
Treatment satisfaction was evaluated using the 12-item HIV-TSQ status and change versions [HIV-TSQ(s, c)] [12, 13], while treatment acceptance was measured using the 3-item ‘General acceptance’ dimension of the ACCEPT questionnaire, which assesses how participants weigh advantages and disadvantages of long-term medication [14].
Post-injection pain was assessed using a numeric rating scale (NRS), whereby a participant selects a whole number between 0, ‘no pain’ and 10, ‘extreme pain’, that best reflects the intensity of pain at injection.
Assessment of Psychometric Properties
Following the relevant PRO Food and Drug Administration Guidance for Industry [15], the psychometric properties of the PIN were investigated to determine the instrument’s adequacy in terms of reliability, validity, and responsiveness. This included the quality of questionnaire completion (including the extent of missing data), the extent to which the possible range of responses for each item is selected (item-level analysis), the intercorrelation of items that contribute to a score (internal consistency reliability), score stability over time when change is not expected (test–retest reliability), assessment of whether relationships among items, domains and concepts conform to a priori hypotheses (construct validity), and whether changes over time in individuals or groups are reflected in a difference in score (responsiveness/sensitivity to change).
Quality of Completion and Item-Level Analysis
The frequency and percentage of response choices and missing data were determined for each PIN questionnaire item at weeks 5, 41, and 48 of the study. Floor and ceiling effects were explored for each of the PIN questionnaire items by calculating the proportion of participants with the minimum possible score (floor) and the proportion of participants with the maximum possible score (ceiling) at weeks 5, 41, and 48.
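As an illustration of the item-level analysis described above, the following Python sketch computes the proportion of missing responses and the floor and ceiling proportions for one item. The function name and the use of `None` to mark a missing response are assumptions of this sketch; the original analysis was run in SAS.

```python
from typing import Dict, Optional, Sequence


def item_level_summary(
    responses: Sequence[Optional[int]],
    minimum: int = 1,
    maximum: int = 5,
) -> Dict[str, float]:
    """Missing-data, floor, and ceiling proportions for one questionnaire item.

    Floor and ceiling are computed among observed responses only;
    missingness is computed among all expected responses.
    """
    observed = [r for r in responses if r is not None]
    n_all, n_obs = len(responses), len(observed)
    return {
        "missing": (n_all - n_obs) / n_all if n_all else 0.0,
        "floor": sum(r == minimum for r in observed) / n_obs if n_obs else 0.0,
        "ceiling": sum(r == maximum for r in observed) / n_obs if n_obs else 0.0,
    }


summary = item_level_summary([1, 1, 5, 3, None])
```

Whether floor and ceiling proportions use observed responses or all enrolled participants as the denominator is a design choice; the sketch uses observed responses and reports missingness separately.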
Internal Consistency Reliability
Internal consistency reliability reflects the extent to which individual items are consistent with each other in the same dimension and reflect a single underlying concept. Item–total correlations (correlations between each PIN questionnaire item and the total score after omitting the item) were calculated. A minimum coefficient of 0.30 was used as the benchmark to denote a moderate correlation [16]. Internal consistency was investigated by calculating Cronbach’s alpha coefficient for each PIN questionnaire domain at week 5, with a coefficient ≥ 0.80 considered as evidence for strong internal consistency [16], but a coefficient ≥ 0.70 was considered acceptable [17]. Coefficients ≥ 0.90 were flagged as indicating potential redundancy.
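The two internal consistency statistics named above can be sketched in Python as follows. This is a minimal illustration using the standard formula for Cronbach’s alpha and a Pearson correlation for the item–total correlation; the choice of Pearson correlation and the restriction to complete cases are assumptions of this sketch, as the source does not specify them.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows: List[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one participant's item scores."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def corrected_item_total(rows: List[Sequence[float]], j: int) -> float:
    """Correlation of item j with the sum of the remaining items."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(item, rest)


# Toy data: three participants answering a two-item domain.
toy = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy)
r_item0 = corrected_item_total(toy, 0)
```

Omitting the item from the total before correlating avoids inflating the coefficient with the item’s correlation with itself, which matters most for short domains such as the two-item scale.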
Test–Retest Reliability
Test–retest reliability (i.e., the extent to which the questionnaire yields the same scores each time it is administered, all other variables being stable) was evaluated for all PIN items by calculating the intra-class correlation (ICC), using the ICC (3, 1) coefficient as per the definition of Shrout and Fleiss [18]. Test–retest reliability was assessed between weeks 41 and 48 among stable participants, defined as participants who indicated stability over time on both the SF-12 and the HAT-QoL between weeks 24 and 48. Test–retest reliability was interpreted as follows: ICC < 0.50 poor reliability, ICC 0.50–0.74 moderate reliability [19], ICC 0.75–0.89 good reliability [16, 20, 21], and ICC 0.90–1.00 excellent reliability [22].
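For readers unfamiliar with the ICC (3, 1) coefficient, the following Python sketch computes it from the two-way analysis of variance mean squares, as (BMS − EMS) / (BMS + (k − 1) EMS), where BMS is the between-participant mean square, EMS the residual mean square, and k the number of occasions. The function names are hypothetical and the interpretation bands are those listed above.

```python
from statistics import mean
from typing import List, Sequence


def icc_3_1(scores: List[Sequence[float]]) -> float:
    """ICC(3,1): two-way mixed model, single measurement, consistency.

    Each row is one participant; each column is one occasion
    (for example, week 41 and week 48).
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)
    ems = ss_error / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems)


def classify_icc(icc: float) -> str:
    """Interpretation bands used in this analysis."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.90:
        return "good"
    return "excellent"


# Toy data: every participant scores exactly one point higher at retest.
toy_icc = icc_3_1([[1, 2], [2, 3], [3, 4]])
```

Because ICC (3, 1) measures consistency rather than absolute agreement, a uniform shift between occasions, as in the toy data, does not lower the coefficient.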
Construct Validity
Convergent validity (i.e., the extent to which the scores correlate with scores from other PRO instruments measuring similar concepts) was assessed between the PIN questionnaire domain scores and the domain scores of HAT-QoL, SF-12, and ACCEPT at week 48, as well as between the PIN questionnaire domain scores and the score from the post-injection pain NRS at weeks 5 and 41 on an individual participant basis. Correlations were evaluated based on pre-defined ranges: 0.10–0.29 (weak correlation), 0.30–0.49 (moderate correlation), and 0.50–1.00 (strong correlation) [23, 24].
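The pre-defined correlation ranges above translate directly into a small classification helper, sketched here in Python. The label "negligible" for absolute values below 0.10 is an assumption of this sketch, since the source does not name that range, and the helper classifies on the absolute value so that negative correlations are handled the same way.

```python
def classify_correlation(r: float) -> str:
    """Label a correlation coefficient using the pre-defined ranges."""
    size = abs(r)
    if size > 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "weak"
    return "negligible"  # label assumed; not defined in the source
```

For example, a coefficient of -0.35 between a PIN domain and the pain NRS would be labeled moderate under these ranges.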
Confirmatory factor analysis (CFA) was performed using week 5 data to summarize how well the variables reflected the hypothesized structure. Model fit was assessed using the comparative fit index (CFI; values ≥ 0.90 indicated acceptable fit) [25], the standardized root mean square residual (SRMR; values < 0.10 were considered acceptable; values < 0.08 were preferable) [26], and the root mean square error of approximation (RMSEA; values < 0.06 were considered acceptable) [27].
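To make the fit criteria concrete, the Python sketch below derives CFI and RMSEA from model and baseline chi-square statistics using their standard definitions, then checks all three indices against the thresholds stated above. SRMR is taken as an input because it requires the observed and model-implied correlation matrices. The function names are hypothetical, and the use of N − 1 in the RMSEA denominator is an assumption (some software uses N).

```python
from math import sqrt
from typing import Dict


def cfi(chi2_model: float, df_model: int,
        chi2_baseline: float, df_baseline: int) -> float:
    """Comparative fit index from model and baseline chi-square statistics."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_baseline == 0 else 1.0 - d_model / d_baseline


def rmsea(chi2_model: float, df_model: int, n: int) -> float:
    """Root mean square error of approximation (N - 1 convention assumed)."""
    return sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))


def fit_summary(cfi_value: float, srmr_value: float,
                rmsea_value: float) -> Dict[str, bool]:
    """Apply the acceptability thresholds stated in the text."""
    return {
        "cfi_acceptable": cfi_value >= 0.90,
        "srmr_acceptable": srmr_value < 0.10,
        "srmr_preferable": srmr_value < 0.08,
        "rmsea_acceptable": rmsea_value < 0.06,
    }


# Made-up chi-square values, for illustration only (not study results).
example_cfi = cfi(100.0, 50, 1066.0, 66)
example_rmsea = rmsea(100.0, 50, 201)
example_fit = fit_summary(example_cfi, 0.05, example_rmsea)
```

In this made-up example the model passes the CFI and SRMR criteria but not the RMSEA criterion, which shows why the indices are reported together rather than singly.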
Known-groups validity (i.e., the extent to which the instrument is able to distinguish clinically different groups) was assessed to determine the degree to which the PIN questionnaire was able to discriminate between participant severity groups hypothesized a priori to be different. For this, PIN item and domain scores were compared between participants grouped according to pain severity on the 0–10 post-injection pain NRS at weeks 5 and 41. The statistical significance of differences in scores between groups was tested using one-way analysis of variance.
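The one-way analysis of variance used for the known-groups comparison can be sketched in Python as below. The sketch returns the F statistic and its degrees of freedom only; obtaining a p-value requires the F distribution, for which a library routine such as `scipy.stats.f_oneway` could be used. The grouping of participants into pain severity categories is taken as given, because the NRS cut-offs are not stated here.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: List[Sequence[float]]) -> Tuple[float, int, int]:
    """F statistic with between- and within-group degrees of freedom."""
    all_scores = [x for group in groups for x in group]
    grand = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy domain scores for three hypothetical pain severity groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F statistic indicates that mean scores differ between severity groups by more than the within-group variability would suggest, which is the pattern expected if the instrument discriminates between groups.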
Responsiveness
Responsiveness (i.e., sensitivity of the questionnaire to detect change over time) was determined by assessing change between weeks 5 and 48 in PIN questionnaire items and domain scores for participants showing change on the HIV-TSQs between baseline and week 44 and for participants showing change on the HIV-TSQc between weeks 5 and 48. Effect sizes were calculated to determine the sensitivity to change over time of the PIN domains. Based on Cohen, effect sizes (d) of ≥ 0.80 were considered large, ≥ 0.50 medium, and ≥ 0.20 small [28]. Responsiveness of the PIN was also assessed with correlation coefficients where values of 0.10–0.29 were classified as weak correlations, 0.30–0.49 as moderate, and 0.50–1.00 as strong correlations according to widely accepted conventions [23, 28].
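The effect size calculation is not specified in detail above, so the following Python sketch shows one common option as an assumption: the mean change between two time points divided by the standard deviation at the first time point, classified with the Cohen thresholds stated in the text. The function names and the label "negligible" for values below 0.20 are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def effect_size(first: Sequence[float], second: Sequence[float]) -> float:
    """Mean paired change divided by the SD at the first time point.

    Assumption: this is one common effect size for responsiveness;
    the source does not state which variant was used.
    """
    changes = [b - a for a, b in zip(first, second)]
    return mean(changes) / stdev(first)


def classify_effect(d: float) -> str:
    """Cohen's thresholds as stated in the text."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # label assumed; not defined in the source


# Toy domain scores at two time points for three participants.
d = effect_size([2, 3, 4], [3, 4, 5])
label = classify_effect(d)
```

An alternative is the standardized response mean, which divides by the standard deviation of the change scores instead; the two can differ noticeably when changes are highly variable.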
Data Analysis
The psychometric analysis sample included all participants who provided at least 1 valid response to a PIN questionnaire item at any time point. No imputation or replacement of missing data was performed for the psychometric analyses. Data processing and analysis were conducted by ICON plc using SAS software (SAS Institute, Cary, NC, USA, v.9.4).
Compliance with Ethics Guidelines
This post hoc psychometric analysis was conducted on anonymous data, and no direct participant contact or primary collection of individual human participant data occurred. Study results are presented in tabular form as aggregate analyses that omit participant identification; therefore, neither informed consent nor ethics committee or institutional review board (IRB) approval was required.
The PRO data used in this analysis were a secondary endpoint in two clinical trials: ATLAS and FLAIR, both of which were conducted in accordance with the principles founded in the Declaration of Helsinki and with Good Clinical Practice. The ATLAS and FLAIR protocols were approved by an institutional review board or ethics committee of each study site.
All participants provided written informed consent to participate in the clinical trials and any subsequent analysis of the data as it was derived from those clinical trials.
Discussion
As summarized in Table 4, this post hoc analysis provides support for the psychometric properties of the PIN questionnaire in a population of PLHIV treated with an IM injected two-drug combination therapy (long-acting CAB + RPV) in two phase III trials [8, 9]. Item-level analysis indicated floor effects for most PIN items, where participants reported little to no bother and few, if any, symptoms associated with the injections, hence the quality of completion can be considered moderate. In assessing the PIN’s internal consistency reliability, item–total correlations were generally strong and indicated high item homogeneity, with an overall Cronbach’s alpha coefficient of 0.92 and values for domain scores ranging from 0.80 to 0.92. These values indicate strong internal consistency; however, values above 0.90 may indicate high item homogeneity [29], suggesting that some items could be removed to reduce redundancy without impacting the validity of the questionnaire. In particular, the Leg movement and Sleep scales showed strong internal consistency, with Cronbach’s alpha of 0.91 and 0.92, respectively, potentially indicating item redundancy. Similarly, Cronbach’s alpha with each item omitted ranged between 0.88 and 0.89 for the Leg movement scale and between 0.89 and 0.90 for the Sleep scale, suggesting that all items contribute similarly to the scale. The strong internal consistency reliability and homogeneity found in this study, and the resulting likely redundancy of some PIN items, could be explored further in future analyses to reduce respondent burden.
Table 4
Summary of psychometric evidence for the PIN questionnaire
In assessing construct validity for the PIN in terms of the correlations between the domain scores of PIN and the domain scores of other PRO instruments, the PIN demonstrated moderate correlations with other relevant clinical outcome assessment (COA) measures defined a priori. In addition, construct validity was assessed for the PIN in terms of its ability to differentiate between different populations when a group difference is expected (known-groups validity). Convergent and discriminant validity results were generally as expected; correlations between all PIN questionnaire domains and the post-injection pain NRS were strong or moderate. There was strong support for the instrument’s known-groups validity: at both time points (weeks 5 and 41), the PIN was able to differentiate between severity groups as defined by levels of post-injection pain on the NRS. Correlations between the PIN domains ‘Bother from ISR’ and ‘Acceptance of ISR’ and the ACCEPT score were found to be weak, contrary to expectations. It is generally assumed that individuals who perceive injections negatively are less likely to be satisfied with the treatment, compared with those who perceive injections more positively. However, since PROs measuring general acceptance and satisfaction with therapy are multifactorial and explore several parameters of treatment (such as convenience, flexibility, and lifestyle fit), there are probably confounding factors influencing correlations between PIN and those instruments, requiring further work to draw firm conclusions. Correlations were also weak between the PIN questionnaire domain scores and the SF-12 Physical Component Summary, the HAT-QoL ‘Disclosure Worries’ domain and the HAT-QoL ‘Life Satisfaction’ domain, as expected, indicating discriminant validity. Overall, both convergent/discriminant validity and known-groups analyses provide moderate to strong validity evidence for the PIN.
Responsiveness of the PIN was moderately demonstrated, and effect sizes for the PIN domains were mainly medium.
In terms of limitations, the design of the phase III studies was not optimal for use in a psychometric validation for several reasons. Specifically, the 7-week period between assessments was much longer than the typical 4-week period to allow assessment of stability over time, and hence test–retest reliability was not applicable. Additionally, the length of time between medication administrations, and the fact that PRO instruments were not administered at the same time as the PIN questionnaire, were not optimal for evaluating responsiveness. As a result, CFA showed moderate support for the instrument’s domain structure, and further investigation may be needed to explore alternative scale structures and scoring. Also, test–retest reliability analysis demonstrated relatively low ICCs for individual PIN questionnaire items, indicating moderate to poor reliability, and the responsiveness could only be partially explored. While our study provides a preliminary indication of the responsiveness and test–retest reliability of the PIN questionnaire, future studies should be conducted to further confirm the PIN’s performance. Generalizability of the results to PLHIV with virologic failure may also be limited. However, the participant experience of injections, particularly pain, may not be influenced by virologic status. Moreover, this participant group (with virologic suppression) may be more sensitive to dimensions captured by the PIN, given that the switch to long-acting CAB + RPV was a choice. This would not be the case for participants with virologic failure, for whom virologic suppression, safety, and efficacy would remain of greatest importance.
The strengths of this study were in the use of the merged ATLAS and FLAIR trial datasets, which provided a relatively large sample size based on a diverse population in terms of age, education, and years since HIV onset, allowing for planned psychometric tests and robust results. Additionally, a wide range of COA instruments included in the dataset could be used as reference measures for assessing various psychometric properties, and, since the data were drawn from interventional clinical trials, sensitivity to change over time could be tested over the duration of the study.
Conclusion
This analysis points towards the reliability, validity, and responsiveness to change for the PIN questionnaire in PLHIV. As such, the PIN questionnaire may provide valuable evidence as a key endpoint in pivotal trials, capable of assessing acceptance of pain, ISRs, and tolerability following injections in PLHIV, which could have important implications for treatment adherence in this population.
Acknowledgements
The authors would like to thank Nicolas Van de Velde, a former employee of ViiV, for his contributions to the study. We also thank the participants of the ATLAS and FLAIR studies.