Data
Data on 165,372 respondents from 2000 to 2008 in Ontario, Canada were collected in the Canadian Community Health Surveys (CCHS) (cycles 1.1, 2.1, 3.1, 2007, and 2008). The CCHS is a repeated cross-sectional survey that collects information related to health status, health behaviours (including smoking), community-oriented health determinants and health care utilization for the Canadian population. The first cycle of CCHS started in 2000 and the data were collected for both 2000 and 2001. The second cycle data were collected in 2003 while the third cycle data were collected in 2005. The surveys after 2006 were conducted yearly.
In Ontario, about half of the sample respondents were selected from an area frame and the other half from a list frame of telephone numbers. A stratified two-stage design established for the Canadian Labour Force Survey (LFS) was used for the area frame, while random sampling from a telephone list was used within each health region. A full description of the sampling methods is available online at Statistics Canada’s website [28]. Under this sampling design, although samples are not uniformly distributed among small areal units (smaller than health regions), almost all census subdivisions (CSDs) contain enough respondents to estimate smoking prevalence at this level. Since CSDs are deemed equivalent to municipalities in Canada, the data provide an important opportunity to examine the spatial and temporal patterns and determinants of smoking prevalence among municipalities.
Respondents’ ages in the collected CCHS data range from 12 to 102. Smokers were defined as individuals who had smoked more than 100 cigarettes in their lifetimes and had smoked at least once in the previous 30 days. In addition to smoking status, the data contain age, gender, socio-demographic factors, psycho-social factors, policy-related variables, geographical locations, and geographical identifiers (postal codes). Variables used in the current analysis are described in Table 1. Since this is a secondary analysis of Statistics Canada data, no ethics clearance is required by the Office of Research Ethics at the University of Waterloo. All security procedures required by Statistics Canada to access and use the data for analysis were followed.
Table 1
Variable description
| Variable | Description |
| --- | --- |
| Response variables: | |
| Smoking status | 1 – individuals who had smoked more than 100 cigarettes in their lifetimes and smoked at least once in the previous 30 days; 0 – otherwise. |
| Successful cessation | 1 – smokers who successfully quit, in the last year or more than a year ago; 0 – otherwise. |
| Exposure variables: | |
| AGE | Age at the time of survey. |
| SEX | 1 – female; 2 – male. |
| Marital status (MS) | 1 – married or common law; 0 – otherwise. |
| Family income (INCOME) | Standardized household income with mean 0 and variance 1. |
| Unemployment (UNEMPLOY) | 1 – full-time or partially employed; 0 – otherwise. |
| Low education (LOWEDU) | 1 – high school or lower; 0 – otherwise. |
| Perceived life stress (PLS) | 1 – not at all stressful; 2 – not very stressful; 3 – a bit stressful; 4 – quite a bit stressful; 5 – extremely stressful. |
| Sense of belonging to communities (SBC) | 1 – very strong; 2 – somewhat strong; 3 – somewhat weak; 4 – very weak. |
| Complete workplace smoking restrictions (SMKRWC) | 1 – smoking completely restricted at the workplace; 0 – otherwise. |
| Partial workplace smoking restrictions (SMKRWP) | 1 – smoking allowed in designated areas or restricted only in certain places; 0 – otherwise. |
| Home smoking restrictions (HOME_RESTRIC) | Restrictions against smoking cigarettes in the home: 1 – yes; 0 – no. |
| Geographic locations (GEO) | 1 – Greater Toronto Area (GTA); 2 – any other urban area; 3 – rural area. |
| YEAR | Survey year: 0 – 2000; 1 – 2001; …; 8 – 2008. |
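The codings in Table 1 can be illustrated with a minimal sketch. The function and argument names below are hypothetical (they are not CCHS field names), and the use of the population variance for standardization is an assumption, since the source only states that income has mean 0 and variance 1.

```python
def smoking_status(lifetime_cigarettes, smoked_past_30_days):
    """Return 1 for a smoker under the Table 1 definition, else 0."""
    # "More than 100" is a strict inequality.
    return int(lifetime_cigarettes > 100 and smoked_past_30_days)


def year_code(survey_year, base_year=2000):
    """Code the survey year as 0 (2000) through 8 (2008)."""
    return survey_year - base_year


def standardize(values):
    """Rescale values to mean 0 and variance 1 (population variance assumed)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]
```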
Temporal and spatio-temporal analyses
To analyze the apparent downward overall time trend of smoking prevalence in Ontario and the factors that may affect it, multi-level temporal models were constructed and fitted using the GLIMMIX procedure in SAS v9.2. Since adult and youth smoking behaviours may be affected by different risk factors, the adult (aged 19 and over; 147,118 respondents) and youth (aged 12–18; 18,254 respondents) populations were analyzed separately. Assuming that the time trend of smoking prevalence is not linear over the years, the full temporal models are defined as follows.
For adult $i$ in census subdivision $j$:
$$ \begin{array}{l} \text{Adult smoking status} \sim \text{binary}\left(p_{ij}\right) \\ \text{Level 1 (person level):} \\ \quad \operatorname{logit}\left(p_{ij}\right) = \beta_{0j} + \beta_1 \mathrm{AGE}_{ij} + \beta_2 \mathrm{SEX}_{ij} + \beta_3 \mathrm{MS}_{ij} + \beta_4 \mathrm{INCOME}_{ij} + \beta_5 \mathrm{UNEMPLOY}_{ij} + \beta_6 \mathrm{LOWEDU}_{ij} \\ \qquad + \beta_7 \mathrm{PLS}_{ij} + \beta_8 \mathrm{SBC}_{ij} + \beta_9 \mathrm{SMKRWC}_{ij} + \beta_{10} \mathrm{SMKRWP}_{ij} + \beta_{11} \mathrm{HOME\_RESTRIC}_{ij} + \beta_{12} \mathrm{GEO} \\ \qquad + \beta_{13} \mathrm{YEAR}_{ij} + \beta_{14} \mathrm{YEAR}_{ij}^{2} + \beta_{15} \mathrm{YEAR}_{ij} \times \mathrm{HOME\_RESTRIC}_{ij} + \beta_{16} \mathrm{YEAR}_{ij}^{2} \times \mathrm{HOME\_RESTRIC}_{ij} \\ \text{Level 2 (census subdivision level):} \quad \beta_{0j} = \gamma_0 + v_{0j} \end{array} $$
(1)
For youth $i$ in census subdivision $j$:
$$ \begin{array}{l} \text{Youth smoking status} \sim \text{binary}\left(p_{ij}\right) \\ \text{Level 1 (person level):} \\ \quad \operatorname{logit}\left(p_{ij}\right) = \beta_{0j} + \beta_1 \mathrm{AGE}_{ij} + \beta_2 \mathrm{SEX}_{ij} + \beta_3 \mathrm{INCOME}_{ij} + \beta_4 \mathrm{PLS}_{ij} + \beta_5 \mathrm{SBC}_{ij} + \beta_6 \mathrm{HOME\_RESTRIC}_{ij} + \beta_7 \mathrm{GEO} \\ \qquad + \beta_8 \mathrm{YEAR}_{ij} + \beta_9 \mathrm{YEAR}_{ij}^{2} + \beta_{10} \mathrm{YEAR}_{ij} \times \mathrm{HOME\_RESTRIC}_{ij} + \beta_{11} \mathrm{YEAR}_{ij}^{2} \times \mathrm{HOME\_RESTRIC}_{ij} \\ \text{Level 2 (census subdivision level):} \quad \beta_{0j} = \gamma_0 + v_{0j} \end{array} $$
(2)
where smoking status follows a binary (Bernoulli) distribution. The log odds of smoking are regressed on year and year squared. For adults, the model at level 1 (individual level) also includes age, sex, marital status (MS), family income (INCOME), unemployment (UNEMPLOY), low education (LOWEDU), perceived life stress (PLS), sense of belonging to communities (SBC), complete and partial workplace smoking restrictions (SMKRWC and SMKRWP), home smoking restrictions (HOME_RESTRIC), and geographic locations (GEO). The GEO variable is included to control for any variation in smoking prevalence between large urban (the Greater Toronto Area), other urban and rural areas. For youth, the model includes age, sex, family income, PLS, SBC, home smoking restrictions (HOME_RESTRIC), and GEO. Assuming that smoking prevalence differs among municipalities, a random intercept was specified at the census subdivision level with a fixed average effect $\gamma_0$ and a random effect $v_{0j}$, which has a normal distribution with a mean of 0.
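As a minimal sketch of how Equations (1) and (2) map covariates to a smoking probability, the following combines the fixed intercept, the CSD-level random intercept and the covariate terms on the logit scale. All coefficient values are hypothetical and chosen only to illustrate the quadratic time trend; they are not estimates from this study.

```python
import math


def inv_logit(eta):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def smoking_probability(gamma0, v0j, betas, x):
    """Probability under a random-intercept logit model shaped like Equation (1).

    betas and x are dicts keyed by covariate name; the intercept for
    census subdivision j is beta_0j = gamma0 + v0j.
    """
    eta = gamma0 + v0j + sum(betas[name] * x[name] for name in betas)
    return inv_logit(eta)


# Hypothetical coefficients for the linear and squared YEAR terms.
betas = {"YEAR": -0.10, "YEAR2": 0.005}
for year in (0, 4, 8):
    x = {"YEAR": year, "YEAR2": year ** 2}
    print(year, round(smoking_probability(-1.0, 0.2, betas, x), 3))
```

A larger $v_{0j}$ shifts every respondent in that census subdivision towards a higher smoking probability, which is how the random intercept represents between-municipality differences.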
The time trend was tested by incrementally adding explanatory variables in the above models. The overall time trend was first tested by adding in only the time variables and controlling for age and sex (Model 1). The socio-demographic, socio-economic (SES), psycho-social, and workplace smoking restriction variables were then added to the model to test whether or not these variables may have potential impacts on the time trend (Model 2). The variable of home smoking restrictions was further added (Model 3), followed by adding in the interaction terms of time and home smoking restriction (Model 4) to test the potential impact of home smoking restriction on the time trends. Since only smokers were asked the question on home smoking restrictions in the 2000 and 2001 surveys and all respondents were asked the same question in 2003–2008 surveys, the above models were fitted using the 2003–2008 data only, which include 112,848 adult and 13,863 youth respondents.
To assess how well spatial dependencies were captured and whether any spatial autocorrelation remained, the global Moran’s I [29] was calculated on the CSD-level residuals, $v_{0j}$, after Equations (1) and (2) were fitted.
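The global Moran’s I statistic can be sketched as follows, assuming binary contiguity weights (1 for adjacent areas, 0 otherwise). The weighting scheme actually used for this test is not stated in the source, and the data structure and toy values here are hypothetical.

```python
def morans_i(values, neighbours):
    """Global Moran's I with binary contiguity weights.

    values: list of area-level residuals, indexed by area.
    neighbours: dict mapping each area index to a list of adjacent area indices.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights: each directed neighbour pair contributes 1.
    s0 = sum(len(nbs) for nbs in neighbours.values())
    cross = sum(dev[i] * dev[j] for i, nbs in neighbours.items() for j in nbs)
    return (n / s0) * cross / sum(d * d for d in dev)


# Four areas arranged in a line: 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(morans_i([1.0, 2.0, 3.0, 4.0], line))    # smooth gradient, positive I
print(morans_i([1.0, -1.0, 1.0, -1.0], line))  # alternating values, negative I
```

Values near zero suggest little remaining spatial autocorrelation in the residuals, while clearly positive values suggest that neighbouring areas still resemble one another after the model is fitted.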
Previous research suggests that the extent of home smoking restrictions is one of the most powerful determinants of cessation [21] and may therefore be an important predictor of smoking reduction. To test the association between smoking restrictions and adult smoking cessation, a model similar to Equation (1) was also constructed, with successful cessation as the outcome and the year variables removed.
Based on the results of the above analysis, the distribution of smoking prevalence among municipalities and the changes in these patterns over time were further modelled and tested using multi-level spatio-temporal modelling (WinBUGS 1.4.3) [30]. The models for adults and youth were constructed as follows.
ADULT:
$$ \text{Smoking status} \sim \text{binary}\left(p_{ij}\right) $$
Level 1 (PERSON LEVEL):
$$ \begin{array}{l} \operatorname{logit}\left(p_{ij}\right) = \beta_{0j} + \beta_1 \mathrm{AGE}_{ij} + \beta_2 \mathrm{SEX}_{ij} + \beta_3 \mathrm{MS}_{ij} + \beta_4 \mathrm{INCOME}_{ij} + \beta_5 \mathrm{UNEMPLOY}_{ij} + \beta_6 \mathrm{LOWEDU}_{ij} + \beta_7 \mathrm{PLS}_{ij} \\ \quad + \beta_8 \mathrm{SBC}_{ij} + \beta_9 \mathrm{SMKRWC}_{ij} + \beta_{10} \mathrm{SMKRWP}_{ij} + \beta_{11j} \mathrm{YEAR}_{ij} + \beta_{12j} \mathrm{HOME\_RESTRIC}_{ij} \end{array} $$
Level 2 (CSD LEVEL):
$$ \begin{array}{l} \beta_{0j} = \gamma_0 + v_{0j} + u_{0j} \\ \beta_{11j} = \gamma_1 + v_{1j} + u_{1j} \\ \beta_{12j} = \gamma_2 + v_{2j} + u_{2j} \end{array} $$
(3)
YOUTH:
$$ \text{Smoking status} \sim \text{binary}\left(p_{ij}\right) $$
Level 1 (PERSON LEVEL):
$$ \operatorname{logit}\left(p_{ij}\right) = \beta_{0j} + \beta_1 \mathrm{AGE}_{ij} + \beta_2 \mathrm{SEX}_{ij} + \beta_3 \mathrm{INCOME}_{ij} + \beta_4 \mathrm{PLS}_{ij} + \beta_5 \mathrm{SBC}_{ij} + \beta_6 \mathrm{HOME\_RESTRIC}_{ij} + \beta_{7j} \mathrm{YEAR}_{ij} $$
Level 2 (CSD LEVEL):
$$ \begin{array}{l} \beta_{0j} = \gamma_0 + v_{0j} + u_{0j} \\ \beta_{7j} = \gamma_1 + v_{1j} + u_{1j} \end{array} $$
(4)
The models at level 1 are similar to the corresponding temporal models in Equations (1) and (2). Since the time trend after controlling for the identified variables was almost linear (see the Results section), only a single YEAR variable (rather than YEAR and $\mathrm{YEAR}^2$) is included in Equations (3) and (4) for simplicity. The GEO variable is omitted since the effects of geographical location are already captured by $u_{0j}$, $u_{1j}$ and $u_{2j}$. At the CSD level, based on the results of the above temporal models, it is assumed that smoking prevalence, the time influence, and the effect of smoking restrictions at home may vary among municipalities for adults, and that smoking prevalence and the time influence may vary among municipalities for youth. The fixed average effects $\gamma_0$, $\gamma_1$, and $\gamma_2$, the uncorrelated random effects $v_{0j}$, $v_{1j}$ and $v_{2j}$, and the spatially correlated random effects $u_{0j}$, $u_{1j}$ and $u_{2j}$ were used for smoking prevalence, the time influence and smoking restrictions at home, respectively, to analyze the municipal-level variations. Given the generally large sizes of municipalities, spatial dependencies likely exist only among adjacent municipalities. Therefore, an intrinsic conditional autoregressive (CAR) model with a contiguity neighbourhood structure (assuming only adjacent areas are spatially autocorrelated) was used for $u_{0j}$, $u_{1j}$ and $u_{2j}$ to model the spatial dependencies at the municipal level. After these models were fitted, the spatial variation of smoking prevalence, time influence, and smoking restrictions at home can be described using $v_{0j} + u_{0j}$, $v_{1j} + u_{1j}$, and $v_{2j} + u_{2j}$, respectively. Since WinBUGS allows missing data to be treated as stochastic nodes (values to be estimated), all the data obtained from 2000 to 2008 were used to fit the models. The posterior mean values and random effects were used to estimate the spatio-temporal variation in smoking prevalence.
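The contiguity-based intrinsic CAR structure can be sketched as below. Under an intrinsic CAR prior with binary adjacency weights, each spatial effect, conditional on the others, is normally distributed around the mean of its neighbours' effects, with variance shrinking as the number of neighbours grows. The flattened `adj`, `weights` and `num` vectors follow the general layout expected by the GeoBUGS `car.normal` distribution; that the authors used this exact form is an assumption, and the three-area map and values are hypothetical.

```python
def car_inputs(neighbours):
    """Flatten a contiguity map into adjacency, weight and count vectors.

    neighbours: dict mapping 1-based area ids to lists of adjacent area ids.
    """
    adj, num = [], []
    for area in sorted(neighbours):
        nbs = sorted(neighbours[area])
        adj.extend(nbs)
        num.append(len(nbs))
    weights = [1] * len(adj)  # binary contiguity weights
    return adj, weights, num


def icar_conditional(area, u, neighbours, sigma2):
    """Conditional mean and variance of u[area] given its neighbours' effects."""
    nbs = neighbours[area]
    mean = sum(u[k] for k in nbs) / len(nbs)
    return mean, sigma2 / len(nbs)


# Hypothetical map of three municipalities in a row: 1 - 2 - 3.
neighbours = {1: [2], 2: [1, 3], 3: [2]}
print(car_inputs(neighbours))
print(icar_conditional(2, {1: 0.2, 2: 0.0, 3: -0.4}, neighbours, sigma2=1.0))
```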
It can be seen that the spatial and temporal interactions are explicitly captured by the spatially dependent coefficient of the YEAR variable, namely $\beta_{11j}$ for adults and $\beta_{7j}$ for youth. This coefficient allows spatially unequal changes in smoking prevalence over time to be mapped and dramatic changes to be identified.
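To illustrate how the area-specific YEAR coefficient could be turned into a mappable quantity, the sketch below assembles $\gamma_1 + v_{1j} + u_{1j}$ for each municipality and converts it to an odds ratio per survey year. The posterior means used here are hypothetical placeholders, not results from this study.

```python
import math


def year_slopes(gamma1, v1, u1):
    """Area-specific YEAR coefficient: gamma1 + v1[j] + u1[j] for each area j."""
    return {j: gamma1 + v1[j] + u1[j] for j in v1}


def annual_odds_ratios(slopes):
    """Odds ratio of smoking per one-year increase, by area."""
    return {j: math.exp(b) for j, b in slopes.items()}


# Hypothetical posterior means for two municipalities.
slopes = year_slopes(-0.05, {"A": 0.00, "B": -0.02}, {"A": 0.01, "B": -0.03})
ratios = annual_odds_ratios(slopes)
steepest = min(slopes, key=slopes.get)  # area with the fastest decline
print(slopes, ratios, steepest)
```

An odds ratio below 1 indicates declining odds of smoking over time in that municipality, so mapping these values shows where the decline is faster or slower than the provincial average.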
Since CCHS is a repeated cross-sectional survey, survey weights were also adjusted for the proposed analysis that pools together data from different cycles. The adjusted weight is constructed as follows:
$$ W = \mathrm{WTS\_M} \times \frac{\mathrm{sample\_size}}{\mathrm{sum\_of\_sample\_sizes}} $$
(5)
where WTS_M is the CCHS survey weight, sample_size is the sample size of the current cycle, and sum_of_sample_sizes is the sum of the sample sizes of all cycles used in the analysis. This adjustment makes samples from different cycles comparable. The adjusted weights were applied to the temporal models (Equations 1 and 2) so that the estimates are representative of the population in the study area. Because the Bayesian models in WinBUGS could not incorporate survey weights, the weights were not applied to the spatio-temporal models in Equations (3) and (4).
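The weight adjustment in Equation (5) amounts to scaling each respondent's survey weight by the share of the pooled sample contributed by that respondent's cycle. A minimal sketch follows; the sample sizes and weight are hypothetical round numbers, not the actual CCHS cycle sizes.

```python
def adjusted_weight(wts_m, cycle_sample_size, pooled_sample_sizes):
    """Equation (5): W = WTS_M * sample_size / sum_of_sample_sizes."""
    return wts_m * cycle_sample_size / sum(pooled_sample_sizes)


# Hypothetical pooled analysis of three cycles.
cycle_sizes = [30000, 40000, 30000]
# A respondent from the second cycle with an original weight of 200.
print(adjusted_weight(200.0, cycle_sizes[1], cycle_sizes))
```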