Introduction
Patient-reported outcomes (PROs), such as quality of life (QOL) or health-related QOL (HRQL), are commonly used endpoints in clinical studies and therapeutic trials in patients with pulmonary diseases. Instruments that assess PROs focus on the perceptions of patients with the condition of interest; as such, they generate meaningful data on disease effects not captured by other outcome measures.
HRQL instruments are generic or disease-specific. The merit of disease-specific instruments is that they contain only items pertinent to patients with the disease of interest. Because of this, disease-specific instruments tend to be more responsive than generic instruments to underlying change. Disease-specific HRQL instruments have been developed for a number of pulmonary conditions, including chronic obstructive pulmonary disease [1-3] and asthma [4,5], but not for idiopathic pulmonary fibrosis (IPF).
IPF is a progressive, fibrosing, parenchymal lung disease [6] with distinctive pathophysiological processes. IPF has no reliably effective therapy, and survival rates are worse than for many cancers [7]. In people with IPF, dyspnea limits physical activity, and hypoxemia ultimately develops, requiring patients to use supplemental oxygen. Given these discomforting aspects and the poor survival rates, it is not surprising that generic HRQL in patients with IPF is impaired [8,9]. Because IPF lacks a cure, there is a great deal of interest in maintaining or improving HRQL, so patients can live with acceptable QOL for however long they survive. Without a disease-specific instrument, there will continue to be uncertainty regarding whether relevant aspects and effects of the disease are being measured adequately and whether drug therapies, or other interventions, have a net beneficial or adverse impact on HRQL. In this manuscript, we report on the development of an IPF-specific HRQL instrument called the ATAQ-IPF (A Tool to Assess QOL in IPF) version 1.
Discussion
We have developed the ATAQ-IPF version 1, an IPF-specific HRQL questionnaire. We used direct patient inquiry to generate an item pool, and we used rigorous statistical methods to reduce item numbers and construct an instrument that contains items tapping domains specifically relevant to patients with IPF.
In Phase I of item reduction, we deleted items with skewed response distributions--this serves the goal of maximizing the power of the ATAQ-IPF to discriminate between respondents with different degrees of HRQL impairment--and reduced item numbers by nearly half. We subjected the remaining items (in their domains and in aggregate) to Rasch analysis. The retained items--by virtue of fitting the Rasch model, like all items that fit the Rasch model--are guaranteed to have the same measurement characteristics as concrete physical measures (e.g., length or weight). Thus, by incorporating Rasch analysis into the development of the ATAQ-IPF, unlike other HRQL questionnaires for which Rasch methodology was not used, we can be confident that it adheres to the basic tenet of arithmetic: 'one more unit means the same amount extra, no matter how much we already have' [20]. So, an increase of one point for an ATAQ-IPF domain or total score means the same thing whether a respondent has severely impaired or near-normal HRQL. This linearity that the Rasch model constructs differs from the assumed linearity of classical test theory and much of item response theory--methodologies used to develop the majority of HRQL instruments [21].
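The additive property attributed above to Rasch-fitting items can be illustrated on the logit scale. The following is a minimal sketch that assumes the dichotomous form of the Rasch model; the analysis behind the ATAQ-IPF may well have used a polytomous variant, and the item locations and function names here are hypothetical, chosen only for illustration.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Probability of endorsing an item under the dichotomous Rasch model.

    theta is the person location and b the item location, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(p: float) -> float:
    """Convert a probability to log-odds (logits)."""
    return math.log(p / (1.0 - p))

def logit_gap(theta_low: float, theta_high: float, b: float) -> float:
    """Difference in log-odds between two persons answering the same item."""
    return log_odds(rasch_probability(theta_high, b)) - log_odds(
        rasch_probability(theta_low, b)
    )

# Hypothetical item locations in logits (not taken from the ATAQ-IPF).
item_locations = [-1.5, 0.0, 2.0]

# Move each person up by one logit, starting from low, middle, and high
# locations, and record the change in log-odds on every item.
gaps = [
    round(logit_gap(theta, theta + 1.0, b), 6)
    for theta in (-2.0, 0.5, 3.0)
    for b in item_locations
]
```

Under these assumptions, every entry in `gaps` is the same: a one-logit step changes the log-odds by one unit regardless of the starting location or the item, which is the sense in which "one more unit means the same amount extra."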
By running Rasch analyses on clusters of items formulating each domain, we were able to pare down items in a systematic fashion. By dropping poor-fitting items, or certain ones from groups with identical logit positions (that only serve to make the questionnaire longer and not necessarily enhance the ATAQ-IPF's power to discriminate between respondents whose status changes over time), we were able to shorten the length of each domain.
The detailed and carefully executed item reduction techniques we used have not been implemented in the development of many other HRQL instruments. Generating content for the ATAQ-IPF, by directly capturing patients' perspectives and using them to build the framework (and specific items) of the questionnaire, ensures its content validity. Involving IPF patients in the development process ensures that all relevant themes and effects are tapped. It is the incorporation of such perspectives that makes the ATAQ-IPF uniquely applicable to IPF patients and not necessarily to patients with other forms of lung disease. Further, including only items that fit the Rasch model guarantees that each of the ATAQ-IPF's scales (domain and total) maintains its additive properties. To our knowledge, only one other investigator has used this type of approach in the development of respiratory disease-specific HRQL instruments [2,3].
Psychometric testing revealed that domains and the overall instrument possess excellent internal consistency reliability [16]. Domain-total correlations confirmed that each domain measures some aspect of the same underlying construct--HRQL--and that each contributes information about HRQL unique from the aggregate contribution of the other items. The ATAQ-IPF, then, functions like an arithmetic test that has individual sections that assess addition, subtraction, multiplication, and division: the test score portrays overall arithmetic ability, but the sections can point to areas in which a student might excel or need additional instruction. Likewise, the ATAQ-IPF overall score serves as a measure of global HRQL, and the domain scores can be used to examine more closely the nature of the impact of an intervention on HRQL.
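As an illustration of the two psychometric checks described above, the sketch below computes an internal consistency coefficient and a corrected domain-total correlation. It assumes Cronbach's alpha as the reliability statistic (this section does not name the statistic used) and correlates each domain with the sum of the remaining domains. The scores are made up for five hypothetical respondents and are not study data.

```python
from statistics import mean, variance

def cronbach_alpha(columns):
    """Cronbach's alpha for a list of score columns (one list per item or domain)."""
    k = len(columns)
    totals = [sum(row) for row in zip(*columns)]
    summed_item_variance = sum(variance(col) for col in columns)
    return (k / (k - 1)) * (1 - summed_item_variance / variance(totals))

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_domain_total(columns, index):
    """Correlate one domain with the sum of all the other domains."""
    rest = [
        sum(value for j, value in enumerate(row) if j != index)
        for row in zip(*columns)
    ]
    return pearson(columns[index], rest)

# Made-up scores: three hypothetical domains, five respondents each.
domains = [
    [2, 4, 6, 8, 10],
    [1, 3, 5, 8, 9],
    [3, 4, 7, 7, 11],
]

alpha = cronbach_alpha(domains)
r_first_domain = corrected_domain_total(domains, 0)
```

Excluding the domain itself from the total avoids inflating the correlation; a value that is high but below 1 is consistent with a domain that shares the underlying construct while still contributing some information of its own.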
The significant correlations between domain scores and FVC%, DLCO%, and 6MWD showed that ATAQ-IPF scores are related to--but also yield their own unique information from--clinically meaningful, commonly used measures of IPF severity. Results from the linear regression analysis add more weight: in a model that controlled for arguably the two most important physiologic measures used to assess IPF patients (FVC% and DLCO%), those measures combined to explain only 25% of the variability (R-square = 0.25) in the ATAQ-IPF total score. Thus, there are factors not captured by these physiologic measures that contribute to HRQL in patients with IPF. Interestingly, there was moderately strong correlation between DLCO% and the Social Participation, Independence, and Sexual Health domains, and there were significant correlations between 6MWD and these domains as well as with the Relationships domain. These results indicate that gas exchange and functional capacity influence more than simply physical well-being, and they underscore the importance of extending HRQL measures to include such domains in patients with IPF.
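To make the R-square reasoning concrete, the sketch below computes the proportion of variance in an outcome explained by two predictors, using the standard closed form based on pairwise correlations. The values are invented for illustration and are not study data; the variable names (`fvc_pct`, `dlco_pct`, `total_score`) are hypothetical labels, and the function is not the authors' analysis code.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def r_squared_two_predictors(y, x1, x2):
    """R-square for a linear regression of y on x1 and x2 (with intercept).

    Uses R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2),
    which requires that x1 and x2 are not perfectly correlated.
    """
    r_y1, r_y2, r_12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# Invented values for six hypothetical patients.
fvc_pct = [55, 62, 70, 48, 81, 66]
dlco_pct = [35, 41, 52, 30, 60, 38]
total_score = [70, 52, 61, 75, 40, 66]

r2 = r_squared_two_predictors(total_score, fvc_pct, dlco_pct)
unexplained = 1 - r2  # share of variance left to factors outside the model
```

Read this way, an R-square of 0.25 leaves 75% of the variability in the total score unaccounted for by the two physiologic measures.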
Investigators commonly view significant associations between HRQL scores and clinical measures of disease severity or functional status as evidence for the validity of an instrument; however, the importance of such associations is primarily in understanding which manifestations of a disease have the greatest effects on HRQL--they are much less relevant to validity. So, although such correlations in this study confirmed our hypotheses that HRQL would be related to IPF severity (as measured by these physiologic variables), the validity of the ATAQ-IPF (or any other instrument) is best judged over time on three other terms: 1) its content--whether it covers all the relevant dimensions on which individuals evaluate their HRQL, or at least those that might be affected by the disease in question; 2) whether items require respondents to indicate the extent to which their QOL (on the various domains) is compromised by their disease; and 3) whether resulting scores are reliable, sensitive, and responsive to change. The ATAQ-IPF certainly meets terms 1 and 2, and further investigation will determine term 3. As with any HRQL questionnaire, validity is not achieved (or even determined) in a single study--it is built. It is only through observing the performance of a questionnaire in multiple studies over time that we can confidently say that it measures what it was intended to measure. That said, the results of the analysis in which we examined differences in ATAQ-IPF scores between subjects not using and those using supplemental oxygen support the validity of the ATAQ-IPF: subjects using supplemental oxygen had more dyspnea and exhaustion, less independence, required more forethought, and had greater impairments in emotional well-being, social participation, sexual health, relationships, and overall HRQL (according to the ATAQ-IPF total) than subjects not using supplemental oxygen.
Although 74 items comprise version 1 of the ATAQ-IPF, this number of items enables it to tap myriad important constructs and to report scores at the domain level. Whether item number can be reduced further, without unacceptable loss of content or reliability, requires additional investigation. Moving forward, we will use the ATAQ-IPF as a secondary outcome measure in a longitudinal study, and we invite other investigators to use the ATAQ-IPF version 1 in their studies as well.
Acknowledgements
The authors wish to thank and acknowledge Michael Gould, MD, MS; Susan Jacobs, RN, MS; Michael Linacre, PhD; Milton Rossman, MD; Anita Stewart, PhD; David Streiner, PhD; and Janelle Yorke, PhD for their assistance and thoughtful input at various stages of this project.
Competing interests
JJS is supported in part by a Career Development Award from the NIH (K23 HL092227). The authors declare that they have no competing interests.
Authors' contributions
Study conceptualization: JJS, SW. Data collection: JJS, DS, KB. Data analysis: JJS, SW, KG, FW. Writing and final approval of manuscript: JJS, SW, KG, DS, KB, FW.