Background
Methods
Development of the draft instrument and pre-testing
Literature review
Focus groups
Variables | Focus Groupsa | CDIsb | Survey 1 (N = 2020) | Survey 2 (N = 1640) |
---|---|---|---|---|
Sex, n (%) | ||||
Male | 109 (47.6) | 42 (47.7) | 932 (46.1) | 792 (48.3) |
Female | 120 (52.4) | 46 (52.3) | 1088 (53.9) | 848 (51.7) |
Age (years), Mean ± SD | 39.7 ± 12.7 | – | 45.0 ± 17.4 | 42.9 ± 16.3 |
18–25 years, n (%) | 34 (14.8) | 27 (30.7) | na | na |
26–50 years, n (%) | 136 (59.4) | 34 (38.6) | na | na |
51–65 years, n (%) | 59 (25.8) | 26 (29.5) | na | na |
18–30 years, n (%) | na | na | 560 (27.7) | 509 (31.0) |
31–45 years, n (%) | na | na | 636 (31.5) | 544 (33.2) |
46+ years, n (%) | na | na | 824 (40.8) | 587 (35.8) |
Race, n (%) | ||||
Caucasian | | | 1628 (80.6) | 1309 (79.9) |
African-American | | | 152 (7.5) | 128 (7.8) |
Other | | | 240 (11.9) | 203 (12.4) |
Education Level, n (%) | ||||
High school and less | 68 (29.7) | 30 (34.1) | 705 (34.9) | 634 (38.7) |
Some college and more | 142 (62.0) | 58 (65.9) | 1315 (65.1) | 1006 (61.3) |
Other | 19 (8.3) | – | – | – |
Smoking Status, n (%) | ||||
Adult smoker with no intention to quit | 71 (31.0) | 22 (25.0) | 437 (21.6) | 408 (24.9) |
Adult smoker motivated to quit | 39 (17.0) | 22 (25.0) | 461 (22.8) | 408 (24.9) |
Adult former smoker | 62 (27.1) | 22 (25.0) | 516 (25.5) | 407 (24.8) |
Adult never smoker | 57 (24.9) | 22 (25.0) | 606 (30.0) | 417 (25.4) |
Expert opinion
- Perceived Health Risk to Self. The perceived negative risk (or impact) of product use to the user’s physical health, ranging from minor immediate concrete manifestations of health risk (e.g., having poor gum health) to more serious long-term ones (e.g., having lung cancer);
- Perceived Addiction Risk. The perceived negative risk (or impact) that product use may have on the user’s sense of being addicted to using the product;
- Perceived Health Risk to Others. The perceived negative risk (or impact) to the physical health of nonsmokers when being around during product use (not to be confused with the category of general risk, i.e., the risk of active use of tobacco products for active users in general);
- Perceived Social Risk. The perceived negative risk (or impact) that product use will affect interpersonal interactions adversely or how the user is perceived by others;
- Perceived Practical Risk. The perceived negative risk (or impact) that product use may have on the user’s time and finances.
Item generation
Cognitive debriefing interviews
Pilot field testing
Psychometric evaluation
Design and procedure
Measurements
Data analysis
Property | Definitions and Acceptability Criteria |
---|---|
Targeting | Targeting refers to the extent to which the range of the target construct measured by each of the scales (i.e., perceived health risk and perceived addiction risk) matches the range of that target construct in the study sample. Better targeting equates to a greater ability to interpret the psychometric data with confidence [50]. This involves examination of the relative distributions of the item locations and the person measurements as well as of the plot of the person-item location distributions, showing the item locations and the person measurements on a common scale. There is no specific criterion. Essentially, the item locations should cover the sample adequately and the sample should cover the item locations adequately. |
Fit | The items of the scales of the proposed instrument must work together (fit) as a conformable set, both conceptually and statistically. Otherwise, it is inappropriate to sum item responses to a total score and consider the total scale score as a measure of the target construct. When items do not work together (misfit) in this way, the validity of the scale is questionable [50]. The following statistical and graphical indicators of fit were investigated [51]: • Item discrimination: Fit residuals summarize the difference between observed and expected responses to an item across all respondents (item-person interaction). Fit residuals should ideally lie within ±2.5. Fit residuals lying outside this range imply misfit of the observed data to the Rasch model. Negative values indicate overdiscriminating and positive values underdiscriminating items. Because of the large sample sizes in Surveys 1 and 2, a substantial number of item misfits was to be expected, but this indicator was still considered helpful because some items were expected to fit much worse than others. • Item fit: Chi-squared values summarize the difference between observed and expected responses to an item for groups (or ‘class intervals’) of individuals with relatively similar levels of ability (item-trait interaction). A chi-squared value with a low likelihood (p-value) implies that the discrepancy between the observed responses and the expected value is large relative to chance for that item. • Item response ordering: This involves the examination of the category probability curves (CPCs) and the threshold probability curves (TPCs), which show the ordering of the thresholds for each item. A threshold marks the location on the latent continuum where two adjacent response categories are equally likely. The ordering of the thresholds should reflect the intended order of the categories from lower (‘no risk’) to higher (‘high risk’) values. Correct ordering supports the assumption that the response categories work as intended. Disordered thresholds indicate that the response categories for a particular item are not working as intended, and therefore that the scoring function for that item is not valid. • Local independence: This involves an examination of item residual correlations [52]. Correlations between the residuals should be low (< 0.30). In addition, residual correlations are assessed against the average of all residual correlations plus 0.30 [53, 54]. If the residuals of an item pair correlate > 0.30, this indicates that the response to one item depends on the response to the other item, i.e., the items are locally dependent [55]. |
Reliability | Reliability refers to the extent to which scale scores are free from random error [56]. This was assessed using the person separation index (PSI), which is an internal reliability statistic comparable to Cronbach’s alpha. The PSI quantifies the error associated with the measurements of individuals in the sample [56]. The PSI ranges from 0 (all error) to 1 (no error). A low PSI implies that the scale items are not able to reliably separate individuals on the scale they define. |
Stability | Comparability of PRI measures across different factors was based on tests of invariance (key criterion of successful measurement), implying that items mean the same to different participant groups under different conditions. This is assessed by means of a test for differential item functioning (DIF) [57]. Invariance was assessed according to demographic criteria (age, gender, education) as well as across different tobacco and nicotine-containing products, different subpopulations based on smoking status and across the application of the scales to perceived personal risk and perceived general risk. DIF is assessed by comparing observed residuals (i.e., the difference between expected responses under the assumption of no DIF and actually observed responses) across groups of participants defined by the DIF factor investigated (e.g., males versus females) and classified in several class intervals along the latent continuum measured by the scale. |
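The two local-independence criteria in the table above (residual correlation > 0.30, and residual correlation > mean of all residual correlations + 0.30) can be illustrated with a minimal sketch. It assumes that standardized item residuals from a fitted Rasch model are already available; the function names and the toy residuals for `item_a`, `item_b` and `item_c` are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_local_dependence(residuals, cutoff=0.30):
    """Flag item pairs whose residuals correlate too strongly.

    residuals: dict mapping item name -> list of standardized residuals
               (one value per respondent, same respondent order per item).
    Returns (all pairwise correlations,
             pairs with r > cutoff,
             pairs with r > mean of all pairwise r + cutoff).
    """
    corr = {
        (i, j): pearson(residuals[i], residuals[j])
        for i, j in combinations(residuals, 2)
    }
    mean_r = sum(corr.values()) / len(corr)
    absolute = [pair for pair, r in corr.items() if r > cutoff]
    relative = [pair for pair, r in corr.items() if r > mean_r + cutoff]
    return corr, absolute, relative


# Hypothetical residuals for three items and six respondents.
toy_residuals = {
    "item_a": [1.0, -1.0, 0.5, -0.5, 1.5, -1.5],
    "item_b": [0.9, -1.1, 0.6, -0.4, 1.4, -1.4],
    "item_c": [-1.0, 1.0, -0.5, 0.5, -1.5, 1.5],
}
corr, absolute, relative = flag_local_dependence(toy_residuals)
print("pairs above 0.30:", absolute)
print("pairs above mean + 0.30:", relative)
```

The relative criterion is included because the average residual correlation in a Rasch analysis is usually negative, so a fixed cutoff of 0.30 alone can be too lenient for short scales.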
Property | Definitions and Acceptability Criteria |
---|---|
Data quality | Data quality refers to the extent to which the scale items are accepted by the participants and, consequently, yield usable responses. Missing data are indicative of a lack of acceptability and/or a lack of applicability of the items from the perspective of the participant. Item-level missing data should be < 10% [58] |
Scaling assumptions | Scaling assumptions refer to the extent to which it is legitimate to sum a set of item scores, without weighting or standardisation, to produce a single total score [59, 60]. Summing scale item scores is considered legitimate, when the items: • are approximately parallel (i.e., they measure at the same point on the scale). This criterion is satisfied when items have similar mean scores [61]; • contribute similarly to the variation of the total score (i.e., they have similar variances), otherwise they should be standardized. This criterion is satisfied when items have similar standard deviations [62]; |
Scale-to-sample targeting | Scale-to-sample targeting refers to the extent to which the range of the construct measured by the scale matches the range of that variable in the study sample. Adequate targeting provides greater confidence in making judgments about the performance of the scale when interpreting results. Poor targeting implies that measurement precision is limited. People with extreme scores represent a sub-sample in which changes within and differences between individuals will be underestimated. Scale scores should span the entire range; floor (proportion of the sample at the minimum score for the scale) and ceiling (proportion of the sample at the maximum score) effects should be low (< 15%) [65]; and skewness, i.e., the third central moment of the distribution capturing its asymmetry, should be between ±1 [66]. There are no published criteria for item-level targeting. |
Reliability | Reliability refers to the extent to which scale scores are free from random error. High reliability indicates that scores are associated with little random error, i.e., are consistent. Internal consistency reliability estimates the random error associated with total scores from the intercorrelations among the items [67]. The recommended level for adequate scale internal consistency is Cronbach’s alpha coefficient ≥ 0.80 [67], and item-total correlations > 0.30 [58]. |
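The traditional criteria in the table above (Cronbach’s alpha ≥ 0.80, corrected item-total correlations > 0.30, floor and ceiling effects < 15%, skewness within ±1) can be computed as in the following sketch. It uses the standard textbook formulas; the function names and the three-item toy data set are hypothetical, and the software actually used in the study is not stated in this section.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def sum_scores(items):
    """items: list of item-score lists (one list per item)."""
    return [sum(scores) for scores in zip(*items)]


def cronbach_alpha(items):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of sum score)."""
    k = len(items)
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(sum_scores(items)))


def corrected_item_total(items):
    """Correlation of each item with the sum of the remaining items."""
    totals = sum_scores(items)
    return [pearson(col, [t - x for t, x in zip(totals, col)]) for col in items]


def floor_ceiling(items, min_cat, max_cat):
    """Proportion of respondents at the minimum and maximum possible sum score."""
    k, totals = len(items), sum_scores(items)
    floor = sum(t == k * min_cat for t in totals) / len(totals)
    ceiling = sum(t == k * max_cat for t in totals) / len(totals)
    return floor, ceiling


def skewness(values):
    """Third central moment divided by the 1.5th power of the second."""
    m, n = mean(values), len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical responses: 3 items, 5 respondents, categories scored 1 to 5.
toy_items = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 3, 3, 4, 5]]
alpha = cronbach_alpha(toy_items)
citc = corrected_item_total(toy_items)
floor, ceiling = floor_ceiling(toy_items, min_cat=1, max_cat=5)
skew = skewness(sum_scores(toy_items))
print(f"alpha={alpha:.2f}, floor={floor:.0%}, ceiling={ceiling:.0%}, skew={skew:.2f}")
```

With only five respondents the toy floor and ceiling proportions are necessarily coarse; the sketch is meant to show the calculations, not realistic values.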
Results
Participants
Participant status | Survey 1 n (%) | Survey 2 n (%) |
---|---|---|
Accessed the survey | 11,914 | 14,904 |
Enrolled in the survey | 2411 | 2400 |
Completed the survey | 2020 | 1640 |
Dropped out during the survey | 391 | 760 |
Not enrolled because of inclusion/exclusion criteria violation | 2512 | 2764 |
Not enrolled because of full quota | 3082 | 4312 |
Scale formation and item reduction (Survey 1)
Proposed Scale (# items) | % coverage item threshold distribution | % items with absolute fit residual > 2.5 a | % items with p (χ2) < 0.05 b | % items with disordered thresholds | % pairs of item residual correlations > 0.30 | % pairs of item residual correlations > mean + 0.30 c | % items with p (DIF) < 0.05 b | PSI |
---|---|---|---|---|---|---|---|---|
Survey 1 Long Form Scales | ||||||||
Health Risk (34) | 88 | 94 | 21 | 0 | 16/595 | 24/595 | 50 | 0.97 |
Addiction Risk (11) | 80 | 82 | 18 | 0 | 3/49 | 4/49 | 9 | 0.94 |
Survey 1 Reduced Scales | ||||||||
Health Risk (18) | 84 | 61 | 0 | 0 | 0/153 | 13/153 | 0 | 0.97 |
Addiction Risk (7) | 75 | 86 | 0 | 0 | 0/18 | 2/18 | 0 | 0.93 |
Survey 2 Reduced Scales | ||||||||
Health Risk (18) | 87 | 72 | 0 | 0 | 0/153 | 8/153 | 0 | 0.97 |
Addiction Risk (7) | 78 | 86 | 0 | 0 | 0/18 | 1/18 | 0 | 0.94 |
Domain, item (abbreviated)a | Item location | Standard error | χ2 (df = 9) | p (χ2)b |
---|---|---|---|---|
PRI Perceived Health Risk | ||||
Cough lasting for days | 0.150 | 0.021 | 4.612 | 0.867 |
Gum health | 0.035 | 0.022 | 2.275 | 0.986 |
Lung cancer | −0.477 | 0.021 | 7.998 | 0.534 |
Wheezing | −0.193 | 0.021 | 1.421 | 0.998 |
Mouth throat cancer | −0.058 | 0.022 | 0.931 | 1.000 |
Aging faster | −0.015 | 0.021 | 0.445 | 1.000 |
Minor illnesses | 0.176 | 0.022 | 1.968 | 0.992 |
Respiratory infection | −0.051 | 0.022 | 5.752 | 0.764 |
Serious illness | 0.049 | 0.022 | 4.425 | 0.881 |
Reduced stamina | 0.135 | 0.022 | 2.138 | 0.989 |
Emphysema | −0.132 | 0.021 | 3.447 | 0.944 |
Cough in the morning | 0.045 | 0.021 | 2.879 | 0.969 |
Sense of taste | −0.288 | 0.022 | 3.543 | 0.939 |
Heart disease | −0.147 | 0.021 | 0.817 | 1.000 |
Earlier death | 0.426 | 0.022 | 5.717 | 0.768 |
Sores mouth throat | 0.319 | 0.022 | 4.140 | 0.902 |
Unfit | 0.001 | 0.022 | 0.824 | 1.000 |
Other cancer | 0.150 | 0.021 | 4.612 | 0.867 |
PRI Perceived Addiction Risk | ||||
Being unable quit | 0.428 | 0.028 | 6.203 | 0.719 |
Feeling addicted | −0.133 | 0.025 | 6.343 | 0.705 |
To feel better | 0.311 | 0.026 | 2.750 | 0.973 |
Feeling like have to smoke | 0.105 | 0.026 | 4.742 | 0.856 |
Cannot stop | 0.230 | 0.028 | 3.665 | 0.932 |
Feeling unable quit | 0.097 | 0.028 | 2.853 | 0.970 |
Anxiety situation people smoke | −1.038 | 0.054 | 10.612 | 0.303 |
Proposed Scale (# items) | Range don’t know responses (%) | Min-Max Sum score | Mean Sum score (SD) | Range CITC | Ceiling/ Floor (%) | Skewness | Cronbach’s alpha | Mean IIC | Range IIC |
---|---|---|---|---|---|---|---|---|---|
Survey 1 | |||||||||
Health Risk (18) | 11–15 | 18–90 | 54.4 (22.32) | 0.89–0.93 | 7/10 | 0.05 | 0.99 | 0.83 | 0.76–0.90 |
Addiction Risk (7) | 8–12 | 6–30 | 20.7 (7.50) | 0.90–0.93 | 8/20 | −0.41 | 0.98 | 0.87 | 0.82–0.91 |
Survey 2 | |||||||||
Health Risk (18) | 12–14 | 18–90 | 56.1 (20.46) | 0.88–0.92 | 5/10 | 0.02 | 0.99 | 0.81 | 0.75–0.89 |
Addiction Risk (7) | 8–13 | 6–30 | 20.6 (7.09) | 0.92–0.95 | 6/18 | −0.32 | 0.98 | 0.89 | 0.85–0.93 |
Psychometric cross-validation (Survey 2)
Construct validity (Survey 2)
Scale | CC rs (n) | THS 2.2 rs (n) | E-CIG rs (n) | NRT rs (n) |
---|---|---|---|---|
PRI-P vs. VAS Health Risk | 0.58 (765) | 0.65 (651) | 0.65 (717) | 0.54 (550) |
PRI-P vs. VAS Addiction Risk | 0.56 (767) | 0.67 (704) | 0.68 (708) | 0.57 (534) |
PRI-G vs. VAS Health Risk | 0.52 (775) | 0.61 (711) | 0.62 (724) | 0.52 (713) |
PRI-G vs. VAS Addiction Risk | 0.54 (771) | 0.59 (702) | 0.61 (714) | 0.52 (704) |
Short and Long-Term Risk Questionnaire | PRI-P: All (n = 773) | PRI-P: NS (n = 184) | PRI-P: FS (n = 192) | PRI-P: CS IQ (n = 203) | PRI-P: CS NIQ (n = 194) | PRI-G: All (n = 778) | PRI-G: NS (n = 192) | PRI-G: FS (n = 196) | PRI-G: CS IQ (n = 197) | PRI-G: CS NIQ (n = 193) |
---|---|---|---|---|---|---|---|---|---|---|
Item 1 | −0.35 | −0.26 | −0.40 | −0.21 | −0.21 | −0.30 | −0.29 | −0.29 | −0.20 | −0.33 |
Item 2 | 0.33 | 0.34 | 0.28 | 0.24 | 0.35 | 0.39 | 0.26 | 0.45 | 0.31 | 0.45 |
Item 3 | −0.28 | −0.27 | −0.34 | −0.14 | −0.14 | −0.29 | −0.26 | −0.24 | −0.23 | −0.25 |
Item 4 | −0.28 | −0.30 | −0.37 | −0.10 | −0.13 | −0.28 | −0.27 | −0.29 | −0.24 | −0.23 |
Item 5 | 0.30 | 0.18 | 0.18 | 0.28 | 0.37 | 0.41 | 0.29 | 0.39 | 0.36 | 0.46 |
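Instrument: Type of Risk Domain | Object | Rasch-Based (logits), Mean (SD) |
---|---|---|
PRI-P: Personal Perceived Health Risk | CC (n = 773) | 2.12 (3.19) |
PRI-P: Personal Perceived Health Risk | THS 2.2 (n = 718) | 0.51 (3.17) |
PRI-P: Personal Perceived Health Risk | E-CIG (n = 726) | −0.15 (3.36) |
PRI-P: Personal Perceived Health Risk | NRT (n = 556) | −1.47 (3.15) |
PRI-P: Personal Perceived Health Risk | CESS (n = 586) | −0.69 (2.86) |
PRI-P: Personal Perceived Addiction Risk | CC (n = 770) | 2.91 (3.51) |
PRI-P: Personal Perceived Addiction Risk | THS 2.2 (n = 706) | 1.23 (3.66) |
PRI-P: Personal Perceived Addiction Risk | E-CIG (n = 712) | 0.61 (3.88) |
PRI-P: Personal Perceived Addiction Risk | NRT (n = 537) | −0.30 (3.62) |
PRI-P: Personal Perceived Addiction Risk | CESS, towards CC (n = 583) | −0.89 (3.60) |
PRI-G: General Perceived Health Risk | CC (n = 778) | 2.51 (2.88) |
PRI-G: General Perceived Health Risk | THS 2.2 (n = 716) | 0.63 (2.97) |
PRI-G: General Perceived Health Risk | E-CIG (n = 728) | −0.17 (3.06) |
PRI-G: General Perceived Health Risk | NRT (n = 718) | −0.70 (3.12) |
PRI-G: General Perceived Health Risk | CESS (n = 767) | 0.07 (2.83) |
PRI-G: General Perceived Addiction Risk | CC (n = 773) | 3.73 (3.06) |
PRI-G: General Perceived Addiction Risk | THS 2.2 (n = 703) | 1.69 (3.46) |
PRI-G: General Perceived Addiction Risk | E-CIG (n = 715) | 0.75 (3.40) |
PRI-G: General Perceived Addiction Risk | NRT (n = 705) | 0.30 (3.29) |
PRI-G: General Perceived Addiction Risk | CESS, towards CC (n = 753) | −0.04 (3.32) |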
Instrument | Smoking Status Group | n | Mean (logits) | SD | t (df) | p-value | Cohen’s d |
---|---|---|---|---|---|---|---|
Differences between personal and general risk | | | | | | | |
PRI-P | CS (all) | 397 | 1.26 | 2.88 | 2.50 (785) | 0.013 | 0.18 |
PRI-G | CS (all) | 390 | 1.77 | 2.88 | | | |
PRI-P | CS NIQ | 194 | 0.93 | 2.96 | 1.21 (385) | 0.227 | – |
PRI-G | CS NIQ | 193 | 1.29 | 2.93 | | | |
PRI-P | CS IQ | 203 | 1.58 | 2.76 | 2.42 (398) | 0.016 | 0.24 |
PRI-G | CS IQ | 197 | 2.25 | 2.76 | | | |
Differences between current smokers and never smokers | | | | | | | |
PRI-P | CS (all) | 397 | 1.26 | 2.88 | 6.28 (579) | <.001 | 0.53 |
PRI-P | NS | 184 | 3.05 | 3.80 | | | |
PRI-P | CS NIQ | 194 | 0.93 | 2.96 | 6.08 (376) | <.001 | 0.62 |
PRI-P | NS | 184 | 3.05 | 3.80 | | | |
PRI-P | CS IQ | 203 | 1.58 | 2.76 | 4.39 (385) | <.001 | 0.44 |
PRI-P | NS | 184 | 3.05 | 3.80 | | | |
PRI-G | CS (all) | 390 | 1.77 | 2.88 | 7.53 (580) | <.001 | 0.68 |
PRI-G | NS | 192 | 3.65 | 2.69 | | | |
PRI-G | CS NIQ | 193 | 1.29 | 2.93 | 8.22 (383) | <.001 | 0.84 |
PRI-G | NS | 192 | 3.65 | 2.69 | | | |
PRI-G | CS IQ | 197 | 2.25 | 2.76 | 5.06 (387) | <.001 | 0.51 |
PRI-G | NS | 192 | 3.65 | 2.69 | | | |
Differences between CS IQ and CS NIQ | | | | | | | |
PRI-P | CS IQ | 203 | 1.58 | 2.76 | 2.28 (395) | 0.023 | 0.23 |
PRI-P | CS NIQ | 194 | 0.93 | 2.96 | | | |
PRI-G | CS IQ | 197 | 2.25 | 2.76 | 3.33 (388) | 0.001 | 0.34 |
PRI-G | CS NIQ | 193 | 1.29 | 2.93 | | | |
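The t statistics and effect sizes in the table above can be approximated from the reported group sizes, means and standard deviations. The sketch below assumes an independent-samples t-test with pooled variance, which is consistent with the reported degrees of freedom (n1 + n2 − 2), and Cohen’s d based on the pooled standard deviation; the section does not state which variant was actually used, and the function names are hypothetical.

```python
from math import sqrt


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def t_and_d(n1, m1, sd1, n2, m2, sd2):
    """Return (|t|, df, Cohen's d) computed from summary statistics."""
    sp = pooled_sd(n1, sd1, n2, sd2)
    diff = abs(m1 - m2)
    t = diff / (sp * sqrt(1 / n1 + 1 / n2))
    d = diff / sp
    return t, n1 + n2 - 2, d


# PRI-P vs. PRI-G among all current smokers (values taken from the table).
t, df, d = t_and_d(397, 1.26, 2.88, 390, 1.77, 2.88)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

For this comparison the sketch yields d ≈ 0.18 and df = 785, as reported, and a t value of about 2.48 rather than the reported 2.50; a small difference of this size is to be expected when recomputing from means and standard deviations rounded to two decimals.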
Carry-over effects (Survey 2)
Sequence | n | Mean (logits) | SD | t (df) | p-value | Cohen’s d |
---|---|---|---|---|---|---|
PRI-P | ||||||
CC first | 159 | 2.08 | 2.98 | 0.18 (771) | 0.860 | – |
CC subsequently | 614 | 2.13 | 3.24 | |||
THS 2.2 first | 149 | 0.62 | 3.19 | −0.45 (716) | 0.650 | – |
THS 2.2 subsequently | 569 | 0.48 | 3.17 | |||
E-CIG first | 142 | −0.25 | 3.42 | 0.39 (724) | 0.696 | – |
E-CIG subsequently | 584 | −0.12 | 3.34 | |||
NRT first | 110 | −1.35 | 2.85 | −0.42 (554) | 0.672 | – |
NRT subsequently | 446 | −1.49 | 3.22 | |||
CESS first | 115 | −0.05 | 2.52 | −2.66 (584) | 0.008 | 0.29 |
CESS subsequently | 471 | −0.84 | 2.91 | |||
PRI-G | ||||||
CC first | 162 | 2.89 | 2.75 | −1.89 (776) | 0.060 | – |
CC subsequently | 616 | 2.41 | 2.91 | |||
THS 2.2 first | 149 | 0.50 | 2.97 | 0.62 (714) | 0.537 | – |
THS 2.2 subsequently | 567 | 0.66 | 2.97 | |||
E-CIG first | 143 | −0.09 | 3.21 | −0.35 (726) | 0.723 | – |
E-CIG subsequently | 585 | −0.19 | 3.03 | |||
NRT first | 140 | −0.21 | 2.85 | −2.10 (716) | 0.037 | 0.20 |
NRT subsequently | 578 | −0.82 | 3.17 | |||
CESS first | 156 | 0.95 | 2.76 | −4.41 (765) | < 0.001 | 0.40 |
CESS subsequently | 611 | −0.15 | 2.80 |