Background
Regular physical activity is an important component of a healthy lifestyle and an essential factor in the prevention of osteoporosis. Yet, despite the well-known benefits of regular activity, surveys have found that more than 60% of adults do not engage in regular exercise and 31% do not participate in any activity [1]. A systematic review published by our group reported adherence rates to exercise in people with osteoporosis of between 52% and 100% [2]. One method that might increase exercise adherence is to understand the motivators, barriers, and preferences that affect physical activity, and then to leverage facilitators and preferences and limit barriers when creating customized exercise programs [1]. Questionnaires are the most frequently used method of data collection in the field of rehabilitation science and the most feasible option for surveying large populations [3, 4]. These self-report questionnaires may be one method of collecting data on factors that affect exercise adherence. Understanding these factors may help develop targeted interventions that improve the quality and delivery of physical activity programs in the research setting and in clinical practice [4]. A growing body of literature has examined levels of physical activity among different populations using self-reported questionnaires, and there is increasing interest in integrating patient-reported outcomes into clinical practice [3].
Exercise is widely recommended to reduce the effects of osteoporosis, falls, and related fragility fractures, and a number of systematic reviews have found that weight-bearing exercises help maintain or increase bone mineral density (BMD) in the hip and spine of women with low bone mass [5-8]. The benefits of exercise are not limited to reducing the consequences of osteoporosis; exercise also plays an important role in improving daily activities [9]. A recent systematic review found that exercise improves activities of daily living (e.g., dressing and bathing) in participants with osteoporosis [9].
We previously described the developmental process and content validity of the Personalized Exercise Questionnaire (PEQ), a self-reported survey that assesses the motivators, barriers, and patient preferences to exercise [10]. Although a previous tool (the Exercise Benefits/Barriers Scale, or EBBS) exists, it does not cover some of the barriers most frequently reported by older adults, such as lack of interest, lack of transportation, pain, and disliking going out alone. The EBBS also has minimal focus on the specific type of exercise that would be preferred, so the PEQ was developed from a number of systematic reviews, expert advice, and participant feedback to address these issues [10]. In a previous paper, the PEQ demonstrated high content validity of individual items (I-CVI range: 0.50 to 1.00) and moderate to high overall content validity (S-CVI/UA = 0.63; S-CVI/Ave = 0.91) among healthcare providers [10]. This article describes the sequential steps in the testing of the PEQ using data collected from patients with low bone mass or osteoporosis. The purposes of this study were to describe the:
1. Cross-sectional construct validity by testing differences between two or more groups with expected differences to establish known-group validity [11];
2. Test-retest reliability of individual items of the PEQ by measuring the stability of an item’s response over time [11].
Discussion
There is now strong evidence that regular exercise can improve health-related outcomes in adults and older adults, and there are emerging data showing significant psychological and cognitive benefits from regular exercise [22]. The Canadian Physical Activity Guidelines recommend adults aged 18 to 64 accumulate at least “150 minutes of moderate-to-vigorous intensity aerobic physical activity per week and at least 2 days per week of muscle and bone strengthening activities” [23]. However, in 2013 just over two in ten Canadian adults ≥18 years of age met the physical activity guidelines [24]. To gain a better understanding of the issues associated with physical inactivity, this study aimed to validate and determine the reliability of the PEQ as a tool to assess the barriers and the facilitators to exercise.
Using the PEQ to understand the factors that influence exercise behaviours may be one method to increase adherence and create a more individualized exercise program. Despite the challenges in validating a questionnaire that captures different facilitators, barriers, and preferences, we were able to provide preliminary support that the PEQ provides valid and reliable information on these aspects. Validity has to be established through multiple evaluations of content, construct, and, where possible, criterion validity. In a previous paper, we described the development of the PEQ and the need to create this tool to address the gap in the literature [10]. Known-group validity is a form of construct validity in which hypotheses are pre-specified and then tested to determine whether a tool can differentiate between groups where differences are expected a priori. Where a statistically significant difference is found, it supports the validity of the tool; where differences are not significant, either the tool/item is flawed, the hypothesis is flawed, or the power is inadequate.
The first hypothesis tested whether participants working full-time are more likely to report lack of time as a barrier to exercise. This premise was strongly supported by the results, and the phi coefficient (effect size) suggested a strong difference between these two groups, supporting the validity of question 34. Past studies report that lack of time is a major barrier to physical activity participation [2, 25], but one study found that lack of time appears to be an excuse rather than a true reason for not being active [20]. In that study, approximately 28 h of leisure time per week were spent on sedentary activities such as watching television, reading for pleasure, napping, and sitting quietly [20]. This item may help clinicians identify working individuals who have difficulty balancing exercise and work demands and incorporate time management strategies to help participants integrate exercise into a busy schedule.
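To illustrate how a known-group comparison of this kind can be quantified, the following minimal Python sketch computes the phi coefficient and the corresponding uncorrected chi-square statistic for a 2x2 table. The function names and the counts are hypothetical and are not the study data; the sketch assumes a simple cross-tabulation of employment status (full-time vs. not) against reporting lack of time as a barrier (yes vs. no).

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]].

    Rows are the two known groups; columns are the two responses.
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    For a 2x2 table the statistic equals n * phi**2.
    """
    n = a + b + c + d
    return n * phi_coefficient(a, b, c, d) ** 2


# Hypothetical counts (NOT the study data):
#                 barrier: yes   barrier: no
# full-time            20             5
# not full-time        10            30
phi = phi_coefficient(20, 5, 10, 30)
chi2 = chi_square_2x2(20, 5, 10, 30)
print(f"phi = {phi:.3f}, chi-square = {chi2:.2f}")
```

A phi value near 0 indicates no association between group membership and the response, while values approaching 1 in absolute terms indicate a strong association.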
The second hypothesis suggested no difference in exercise group sizes between older and middle-aged adults, corroborating that item 22 measures the construct it claims to measure. Although previous papers suggested that older adults prefer to exercise alone rather than in a group-based setting, recent findings challenge that literature, and new studies have found that older adults prefer group-related interventions among people their own age [17]. One reason why older adults may have favoured solitary exercise programs in previous literature is their perception that exercise classes tend to be populated by individuals younger than themselves [17]. Beauchamp et al. (2007) found that older adults prefer exercising in a group setting with individuals their own age [17], and adherence tends to be far better when exercise is done in groups rather than alone [25-27]. Future exercise designs should use this item to determine group size preferences for an exercise program and, based on the majority preference, design a program in which participants exercise either alone or with other individuals. Since older adults prefer to exercise with people their own age, having an instructor of a similar age to the participants may also help participants feel more comfortable exercising.
The inverse relationship between SES and physical inactivity has been well demonstrated empirically in the literature [15, 16, 18, 19, 28, 29]. We hypothesized that participants from a lower SES would report cost as a barrier; however, we found no difference between the SES groups. Although the hypothesis was not supported in this study, we doubt the item itself is flawed. Recently, three large systematic reviews emerged questioning this relationship [30-32]. In these reviews, both higher and lower SES groups reported being physically active, but the higher SES group was more likely to report leisure-time physical activities such as going to the gym [30], while those in the lower bracket reported household or occupational physical activities such as cleaning or construction work [31]. Taken together, it is possible that neither the item nor the hypothesis is flawed, since the type of physical activity was not specified. In addition, none of the systematic reviews were able to claim that individuals of higher SES are more active than those in the lower group. More than half of the participants in this study were retired or not working due to disability and reported an income of less than $50,000. After removing the retired respondents from the known-group validity test, there were still no differences between groups. Another possible explanation is that social supports available through the Canadian government for low-income families reduce the burden of access to exercise facilities and alleviate some of the costs of exercise programs. This is still an important item to evaluate, and researchers and clinicians should be aware of subsidies that can influence the financial costs of an exercise program.
Environmental correlates of physical activity have gained attention over the last decade and include accessibility to a facility, aesthetic attributes, and safety features [15]. The validity of this item is important, and the results provide evidence that the item measures what it is supposed to measure. Environment is hypothesized to influence behavioural intentions based on a meta-analysis that found individuals with a more positive attitude toward their environmental surroundings were more likely to accomplish their intended behaviour [33]. Thus, environmental barriers should not be ignored when designing future exercise programs and promoting adherence. Designing exercise facilities that are safe and aesthetically pleasing may be a simple way to encourage exercise behaviours, and the PEQ can be used to identify such environmental barriers.
The PEQ demonstrates moderate test-retest reliability, with some domains having better reliability than others. Although some items had a low kappa score, this does not necessarily indicate low confidence in the item if it has a high absolute agreement score. An item’s reliability may be questioned when both the absolute agreement and the kappa score are low. Interestingly, even though the test-retest settings differed (the first survey was completed in the clinic and the second at home), most items demonstrated moderate to high reliability.
Questions 2 (healthcare providers’ attitude toward exercise) and 3 (friends’/family’s attitude toward exercise) had the lowest scores in the first domain, which might indicate a hidden problem. It has been reported that 79% of Canadians see a physician more frequently than any other healthcare provider; however, physicians and nurses have the least knowledge and confidence regarding exercise and exercise prescriptions compared to other healthcare providers [34]. Although physicians may want to encourage an active lifestyle, their lack of knowledge and confidence in prescribing exercise may have been reflected in the respondents’ answers. About 28% of participants selected a different answer the second time, and there was no pattern to the selection process; a few participants selected “not sure” the first time and “yes” the second, while others selected “yes” the first time and “no” the second. A similar situation may be occurring with the respondents’ family and friends, who may also believe exercise is important but may fail to persuade participants to take an active part in exercise.
Questions 4 and 5, regarding the location of an exercise facility and transportation, demonstrated “no agreement” and “slight agreement”, respectively. In question 4, the absolute agreement calculation showed that 98% of participants selected the same answer in both rounds; the discrepancy between the unadjusted level of agreement and kappa may be explained by what is known as the kappa paradox. In this paradox, analysis may show a high value for the absolute agreement and a drastically low kappa score [35]. Although a maximum attainable kappa (k_m) is suggested to fix this imbalance, it may not solve the paradox [35]. Thus, even though question 4 has a low kappa, this does not represent the true precision of the item. Item 5 also demonstrated low reliability. The absolute agreement calculation showed that 77% of respondents selected the same answer in both rounds. This item may indicate that transportation needs fluctuate on a daily basis. The majority of respondents were over the age of 60 and depended on family or friends to assist them. Transportation has been listed as one of the major barriers to exercise in older adults and in the osteoporosis population [36, 37]. Although the reliability of this question is low, it is important to examine the dynamics of this barrier.
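The kappa paradox described above can be reproduced with a small numerical sketch. Kappa is defined as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the marginal totals. When nearly all respondents choose the same category on both occasions, p_e is almost as high as p_o, so kappa collapses toward zero even though absolute agreement is high. The Python code below is an illustrative sketch only: the function name and the two test-retest tables are hypothetical and are not the study data.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa) for a square test-retest table.

    table[i][j] is the number of respondents who chose category i at
    the first administration and category j at the second.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    p_observed = sum(table[i][i] for i in range(k)) / n
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return p_observed, (p_observed - p_expected) / (1 - p_expected)


# Hypothetical tables for 42 respondents (NOT the study data).
# Skewed marginals: almost everyone answers "yes" both times.
skewed = [[41, 1],
          [0, 0]]
# Balanced marginals with the same number of agreeing respondents.
balanced = [[20, 1],
            [0, 21]]

for name, table in (("skewed", skewed), ("balanced", balanced)):
    agreement, kappa = cohen_kappa(table)
    print(f"{name}: agreement = {agreement:.3f}, kappa = {kappa:.3f}")
```

Both hypothetical tables have the same absolute agreement (41 of 42 respondents), yet only the table with balanced marginals yields a high kappa, which is why absolute agreement is reported alongside kappa.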
Weighted kappa was used to determine the reliability of each item in section 3, which ranged from fair to almost perfect agreement. The lowest subscale scores were in questions 11 (able to walk longer) and 12 (more flexible). Participants may have had more time to think about their goals and reflect on each item since the second questionnaire was completed at home. Older adults leave, rejoin, and switch exercise classes as their commitments and interests change over time; one longitudinal study following 541 participants found that 21% dropped out of an exercise program and joined a different program over 3 years [38]. For this reason, exercise goals should be reassessed frequently and individuals should be given the opportunity to try out different programs.
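For ordinal items, weighted kappa gives partial credit when the two responses fall in neighbouring categories. The Python sketch below shows one common formulation, 1 minus the ratio of weighted observed disagreement to weighted chance-expected disagreement. The weighting scheme is an assumption: linear weights are the default here and quadratic weights are available through the hypothetical `power` parameter. The function name and the example table are illustrative and are not the study data.

```python
def weighted_kappa(table, power=1):
    """Weighted kappa for a square table of ordinal test-retest responses.

    table[i][j] counts respondents choosing category i first and j second.
    Disagreement weights are (|i - j| / (k - 1)) ** power:
    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    k = len(table)
    if k < 2:
        raise ValueError("at least two categories are required")
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / n ** 2
    if expected == 0:
        raise ValueError("weighted kappa is undefined for this table")
    return 1 - observed / expected


# Hypothetical 3-category ordinal item (NOT the study data).
example = [[10, 2, 0],
           [1, 10, 2],
           [0, 1, 10]]
print(f"linear weighted kappa    = {weighted_kappa(example):.3f}")
print(f"quadratic weighted kappa = {weighted_kappa(example, power=2):.3f}")
```

With only two categories the weights reduce to 0 for agreement and 1 for disagreement, so the result coincides with unweighted kappa.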
Section four had a reliability score for each item that ranged from moderate to substantial agreement. Question 23 regarding learning proper techniques had the lowest reliability score, which was expected since it had nine options. For this item participants selected one or two more items the second time. Overall, respondents’ answers were not very different from the first round, differing by just one or two choices.
Section five regarding feedback and tracking had the highest reliability, and each item ranged from substantial agreement to almost perfect agreement. Interestingly, the majority of participants that selected “yes” to receiving feedback also selected “yes” to providing feedback and tracking, while the same pattern was seen for those who selected “no”.
The last section, regarding barriers to exercise, had item reliability scores that ranged from substantial agreement to almost perfect agreement. There was a general trend where, the second time, participants checked one or two additional barriers. This could have happened because respondents had more time to think about their barriers while completing the PEQ the second time.
Although ceiling and floor effects can be an important consideration for outcome measure questionnaires, they are less of a concern for the PEQ since its purpose is to identify the facilitators, barriers, and preferences to exercise. While we were concerned with whether the questionnaire failed to identify these traits, ceiling and floor analyses were not the best way to assess the performance of this type of questionnaire. For example, one barrier is not necessarily a floor effect if it prevents the person from exercising. Similarly, one significant facilitator may offset many smaller barriers; for this reason, ceiling and floor effects would be difficult to interpret. While it may be mathematically possible to calculate ceiling and floor effects, their interpretation may not be clinically meaningful.
Despite the substantial work done to validate the PEQ, its usefulness as a tool to identify facilitators, barriers, and preferences to exercise still needs more evaluation. A limitation of this study is that we evaluated the construct validity of only 4 items, so these results cannot be assumed to generalize to other items, although not all items are appropriate for known-group analysis. The next step should be to test the validity of the remaining questions in the osteoporosis population. One method of testing validity is to use a subclass of construct validity such as convergent or discriminant validity. For example, questions 2 (healthcare providers’ attitude toward exercise) and 3 (family/friends attitude toward exercise) can be tested for convergent validity against the normative beliefs domain in the Theory of Planned Behaviour Questionnaire. Similarly, entire sections such as domain 3 (my exercise goals) can be validated against the Goal Content for Exercise Questionnaire, and questions 32 (“I do not exercise as often as I like because:”) and 35 (“do weather conditions stop you from exercising”) can be correlated with items on the Self-Efficacy for Exercise Scale in convergent validity analyses. Concurrent validity should not be used to validate the PEQ since this type of validity compares items to a known standard, and there are no recognized tools that measure facilitators, barriers, or preferences to exercise in older adults [10].
After confirming the validity of all items in the PEQ, the next steps should be to use this questionnaire in the osteoporosis population to identify the major facilitators and barriers and to assess different methods of leveraging the motivators and limiting the obstacles to exercise. Some barriers, such as being in a wheelchair, would require researchers and clinicians to work with their participants to find unique methods to mitigate these barriers in an exercise program. Studies using the PEQ can customize programs and determine their effectiveness in improving exercise adherence in clinical trials. It is also important to train and educate researchers and clinicians on how to use the PEQ and to help them understand the different factors that affect adherence. To realize the full benefits of the PEQ, researchers and clinicians must work together with participants to find solutions to the factors that affect adherence.
Strengths and limitations
Strengths of this paper include a sample that met sample size calculations, a diagnosis for all patients from a single rheumatologist, and a single independent evaluator who conducted all the data collection. Although this paper conformed to the highest standards of work, it is not without limitations. Our required test-retest sample size was estimated at 46; however, only 42 surveys were returned. It is unlikely that 4 more responses would have changed our conclusions, but some imprecision in our estimates is possible.
The PEQ was developed and tested in a southern Ontario population that was mainly Caucasian, so its validity, reliability, and generalizability in other ethnic or religious groups are unknown; geographical factors that affect exercise adherence should also be tested. These issues should be addressed in formal cross-cultural validation studies. This study also recruited more women than men, which could limit the generalizability of the findings to males. Many participants were retired or not working due to disability, and their reported earnings may not have accurately reflected their true SES. Lastly, we did not collect information on those who declined to participate, who may differ in important ways in their facilitators, barriers, and preferences towards physical activity.