Introduction
While a wide variety of self-report measures have been developed to assess adherence with HIV antiretroviral therapy (ART), few of the questionnaire items that make up these measures have been subjected to rigorous cognitive testing to ensure that the items are consistently understood by respondents. Accurate self-reports of medication adherence could be useful in routine clinical care because research has consistently shown that physicians’ assessments of their patients’ adherence with ART are inaccurate [1–4]. They could also be useful for research when more objective measures such as electronic drug monitoring (MEMS) caps [5] or unannounced pill counts [6, 7] are impractical or too costly [8, 9].
A number of self-report measures of medication adherence have been developed for chronic medical conditions such as hypertension and diabetes (e.g., the Morisky scale), with varying levels of validity testing [10–13]. For HIV, a wider variety of instruments have been developed and used [14]. The validity of these instruments has generally been assessed by examining their relationship to laboratory outcomes, most commonly viral loads. Correlations with viral loads are consistently in the 0.3–0.4 range [14, 15], and sometimes somewhat higher. Previous work by our group showed that a rating item performed better than either a frequency item or a percent item when electronic drug monitoring (MEMS) was used as a gold standard [16]. Subsequent work by others has confirmed this finding [17, 18]. However, little is known about why certain items appear to perform better than others [15], or whether further improvements can be made.
Another important issue for survey designers is whether it is necessary to ask about each of the individual medications that make up an antiretroviral therapy regimen, or whether one can ask about the regimen in the aggregate. Relatively few papers have attempted to assess differential adherence [19–23]. While some of these studies suggest that it is not necessary to measure individual medications [19, 20, 23], these were relatively small, single-site studies, and other studies suggest that differential adherence may be consequential [21, 22]. Thus, it remains unclear whether the extra effort needed to measure adherence with each component of a regimen, which in the case of a three-drug regimen triples the respondent burden, is worthwhile.
To better understand why some items perform better than others, and to try to optimize the quality and performance of such measures, we conducted an extensive, iterative series of in-depth cognitive interviews with a socioeconomically and demographically diverse group of patients with HIV in Massachusetts and Rhode Island to find out how they understood the survey items. We then conducted pilot tests of the best items in over 350 patients who completed a pencil-and-paper version of the survey, and over 6,400 patients who completed an online version of the survey. The online version included a randomized test of whether responses differed if respondents focused on an individual medication or the antiretroviral regimen as a whole. We had three specific study questions: (1) Which item stems were most consistently understood by respondents and which response tasks could respondents use best to provide answers? (2) Can patients respond accurately to questions about their whole ART regimen or is it necessary to ask questions about individual pills in the regimen? (3) What are the psychometric characteristics of the resulting adherence measurement scales?
Discussion
There were three main findings from these analyses. First, the four rounds of cognitive testing allowed us to develop a set of items that we believe can be consistently understood by respondents from diverse sociodemographic and educational backgrounds. Second, the randomized experiment showed convincingly that adherence scores did not differ meaningfully between those asked about “all your HIV medications” and those who responded with a specific medication in mind. Third, in pilot testing, the internal consistency reliability of the three-item scale was excellent.
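The internal consistency statistic referred to here is Cronbach’s alpha, which for a k-item scale compares the sum of the item variances to the variance of the total score. As a generic illustration only (the score vectors below are hypothetical, not study data), it can be computed as follows:

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
# The three "items" below are hypothetical four-respondent score vectors, not study data.

def cronbach_alpha(items):
    """items: list of k lists; items[i][j] is respondent j's score on item i."""
    k = len(items)
    mean = lambda xs: sum(xs) / len(xs)
    var = lambda xs: sum((x - mean(xs)) ** 2 for x in xs) / len(xs)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total score
    return (k / (k - 1)) * (1 - sum(var(item) for item in items) / var(totals))

# Three perfectly consistent items yield alpha = 1.0; real scales fall below that.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(round(cronbach_alpha(items), 3))  # 1.0
```

Because the same variance estimator is used in numerator and denominator, the result is unchanged whether population or sample variances are used.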
We think results from cognitive interviewing can and should be more explicitly presented as a way of documenting the strengths and weaknesses of survey questions. Because so many of the commonly used approaches to asking questions about medication taking proved unworkable in our cognitive testing, we thought it would be useful to describe our findings: for example, the inconsistency with which patients understood concepts such as the “last week” or the “last month,” the ambiguity of the term “as prescribed,” the confusion that visual analogue scales and percentages created for many respondents, and the fact that patients were generally more comfortable using words than numbers in their responses. While we tested these issues in two HIV care settings, we suspect that many of the findings would generalize to other populations being asked similar questions.
In previous work that used electronic drug monitoring as a reference standard, we found that a rating scale correlated better with objectively measured adherence than other response sets did [16], and other investigators using the same rating item have since reported similar findings [17, 18]. We had previously speculated that the adjectival rating corresponded more closely to objective adherence data because it mapped more closely onto the cognitive process patients actually used to form responses, which we thought was probably more an estimation process than an enumeration process. Our cognitive testing supports this theory. Only about half of respondents could demonstrate sufficient recall to describe details of their pill taking over a 30-day period. The majority were clearly estimating.
Interestingly, the rating item that we tested previously [16] was worded as follows: “Thinking back over the last month, on average how would you rate your ability to take all your [HIV] medications as prescribed” (response options: very poor to excellent). Even though the performance of this item has been excellent, our cognitive testing showed that respondents did not have a consistent understanding of “the last month,” “rate your ability,” or “as prescribed.” This led us to modify the question in steps that produced the current version: “In the last 30 days, how good a job did you do at taking your HIV medications in the way you were supposed to?”
There is a small literature addressing whether adherence differs in clinically important ways among the individual medications that make up HIV antiretroviral regimens. Wilson et al. [19], using self-report measures for multiple individual antiretrovirals, concluded that patients tended to take (or not take) the individual antiretrovirals in their regimen as a group, rather than taking some but not others at a given dosing time. McNabb et al. [20] used electronic drug monitoring pill caps (Aprex) and found very little differential adherence among medications scheduled to be taken at the same time. Deschamps et al. [23] found little differential adherence using a self-report measure. Gardner et al. [21] used pharmacy refill data and found that 15 % of patients in an unselected clinical population had “selective adherence,” defined as a ≥5 % difference between two drugs in a regimen over an observation period of at least 60 days. In a subsequent paper from a randomized trial, Gardner et al. [22] used self-report to assess differential adherence: adherence was assessed separately for each component of the regimen, and patients were classified as having differential adherence if the assessments disagreed. In this 60-month trial with assessments every 4 months, 29 % reported differential adherence at least once, and 10 % reported it more than once.
While differential adherence clearly exists, and is probably clinically consequential [22], this research addresses a practical measurement question, not a clinical question. We tested the hypothesis that we would get quantitatively different results if we asked about patients’ global adherence with their HIV medications than if we asked them to respond with reference to a single, individual medication. In a web-based trial with over 3,000 responses in each arm, the difference in the three-item scale was 0.8 points on a 100-point scale (91.0 vs. 90.2). While this difference was statistically significant at p < 0.05, we do not think it is clinically important.
We think the results of this web-based trial are useful because asking respondents to report on their antiretroviral regimen overall is far simpler, and far less burdensome, than identifying individual medications and then asking questions about each of them. The very small difference we observed between arms was in the expected direction: because we know that some differential adherence exists, we hypothesized that asking about non-adherence with three medications would be more sensitive than asking about non-adherence with a single medication. In short, we believe the data presented here support the assertion that for most clinical and research applications it is reasonable to use these self-report items with reference to patients’ entire antiretroviral regimen. That said, investigators interested specifically in differential adherence can repeat these three items for each of the pills in a regimen.
One of the primary challenges in measuring any socially desirable behavior by self-report is avoiding ceiling effects. That is, to the extent possible, one would like to avoid respondents over-reporting their adherence in such a way that all or most score at the top of the scale; this type of measurement error leaves a measure with little useful variation to explore analytically. Of course, if the surveyed respondents were in fact highly adherent, then ceiling effects would reflect the true underlying behavior rather than measurement error. In this study, because we have no objective adherence measure with which to compare our adherence measures, we cannot identify the portion of our measurement that is error, and we cannot directly compare our findings with other studies that used related items in different populations [16–18]. However, very recent reports from many settings indicate that full or excellent adherence with current antiretroviral regimens is quite high [28, 29].
This study has several limitations. First, it was not possible to do cognitive testing of all of the different items and response scales that have been used for self-report of medication adherence. We used judgment to select which items to test, and it is possible that other items or approaches would also have fared well in cognitive testing. Second, although we purposefully conducted our cognitive testing in a sociodemographically diverse sample of persons with HIV, it is possible that testing in other populations would yield different results. Third, our web-based sample consisted largely of well-educated, gay, white men, and it is possible that the findings from the randomized experiment would have differed in another population. Arguing against this is the fact that the distributional and psychometric characteristics of the clinic-based sample were strikingly similar to those of the web-based sample despite very different sample characteristics. Fourth, we cannot assess whether some participants completed the web-based survey fraudulently, as has been described [30]. Fifth, while we have presented findings from cognitive and psychometric testing, until we complete testing currently underway that includes an objective adherence measure as a comparator, we cannot make any statements about the validity of the items or the scale. Finally, use of these items in non-English-speaking settings will require careful translation and back-translation [31, 32] as well as additional cognitive testing.
In conclusion, through detailed cognitive testing we have developed a new, short set of medication adherence self-report items. Next, in a large field test, we found that asking patients to report on adherence with their whole antiretroviral regimen produced similar results to asking them about individual medications. The three items and the resulting adherence scale had good distributional characteristics and an excellent Cronbach’s alpha. Both the lessons from our cognitive testing and the resulting items should be applicable to self-report of other medications used chronically for other conditions. Formal validity testing is underway, and rigorous testing of these items in a variety of other settings is encouraged.