Questionnaire data from study participants were linked probabilistically to data from the NSW Register of Births, Deaths and Marriages up to 30 June 2012 to provide data on fact and date of death. This probabilistic matching is known to be highly accurate (false-positive and false-negative rates <0.4%) [13]. Death registrations capture all deaths in NSW. Cause of death information was not available at the time of analysis. In order to conduct sensitivity analyses, questionnaire data were also linked probabilistically to data from the NSW Admitted Patient Data Collection, which is a complete census of all public and private hospital admissions in NSW. The linked data that were used contained details of admissions in participants from the year 2000 up to the point of recruitment, including the primary reason for admission using the International Classification of Diseases 10th revision – Australian Modification (ICD-10-AM) [14] and up to 54 additional clinical diagnoses.
Statistical methods
There were 266,777 participants with valid data on age and date of recruitment. Participants with data linkage errors (n = 20, 0.01%), age below 45 years at baseline (n = 3, 0.001%), and missing or invalid data on smoking status (n = 860, 0.3%) were excluded. To minimise the potential impact of changes in smoking behaviour and higher mortality in those with baseline illness (also known as reverse causality or the “sick quitter” effect), participants with a self-reported history of doctor-diagnosed cancer other than melanoma and/or non-melanoma skin cancer (n = 30,393, 11%) and those with a history of cardiovascular disease at baseline, defined as self-reported doctor-diagnosed heart disease, stroke, or blood clot on the baseline questionnaire (n = 30,548, 11%), were excluded from this study. It was not possible to exclude all individuals with respiratory illness because this information was not available in an appropriate form from the baseline questionnaire. However, sensitivity analyses were conducted to investigate the impact on the main results of additionally excluding individuals with a history of admission to hospital with chronic obstructive pulmonary disease or other respiratory illnesses (defined as an admission to hospital with ICD-10-AM diagnosis codes J40 to J44 and J47 in any of the 55 diagnostic fields) in the 6 years prior to completing the baseline 45 and Up Study questionnaire.
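The respiratory exclusion rule used in the sensitivity analyses can be illustrated with a minimal sketch. The function names and the input layout (one list of diagnosis codes per admission) are assumptions made for illustration and do not describe the study's actual data processing. Restriction to admissions in the 6 years before baseline is assumed to have been applied before these functions are called.

```python
import re

# Three-character ICD-10-AM categories named in the text: J40 to J44 and J47.
RESPIRATORY_CATEGORIES = {"J40", "J41", "J42", "J43", "J44", "J47"}


def is_respiratory_code(code):
    """Return True if an ICD-10-AM code falls in J40-J44 or J47."""
    if not code:
        return False
    # Normalise, e.g. "j44.1" -> "J441", then compare the 3-character category.
    cleaned = re.sub(r"[^A-Za-z0-9]", "", code).upper()
    return cleaned[:3] in RESPIRATORY_CATEGORIES


def has_respiratory_admission(admissions):
    """Check every diagnostic field of every admission for a qualifying code.

    `admissions` is a hypothetical structure: an iterable of lists, each list
    holding the diagnosis codes (primary plus additional) of one admission.
    """
    return any(
        is_respiratory_code(code)
        for diagnosis_fields in admissions
        for code in diagnosis_fields
    )
```

A participant for whom `has_respiratory_admission` returns `True` would be dropped in the sensitivity analysis and retained in the main analysis.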
Smoking status was classified according to the responses to the following series of items on the baseline questionnaire: “Have you ever been a regular smoker? If “Yes”, how old were you when you started smoking regularly? Are you a smoker now? If not, how old were you when you stopped smoking regularly? About how much do you/did you smoke on average each day?” Never-smokers were participants who answered “No” to the question, “Have you ever been a regular smoker?”; current smokers were those who answered “Yes” to this question and “Yes” to being a smoker now; and past smokers were those who indicated that they had ever been a regular smoker but who indicated that they were not a smoker now. The age at ceasing smoking, among past smokers, was taken as the age they indicated they stopped smoking regularly and was categorised as <25, 25–34, 35–44, 45–54, and ≥55 years. Among current and past smokers, the number of cigarettes smoked per day was taken from the answer to the question about how much they smoked on average each day and was categorised as ≤14, 15–24, and ≥25 cigarettes/day.
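The classification rules above can be sketched as follows. This is an illustration only: the function names, the ASCII category labels, and the handling of unanswered items (returned as `None`, corresponding to the "missing or invalid" smoking status that led to exclusion) are assumptions rather than the study's code.

```python
def smoking_status(ever_regular, smoker_now):
    """Classify a participant as 'never', 'current', or 'past'.

    Arguments are True/False answers to "Have you ever been a regular
    smoker?" and "Are you a smoker now?"; None marks an unanswered item.
    """
    if ever_regular is None:
        return None  # assumed: status cannot be determined
    if not ever_regular:
        return "never"
    if smoker_now is None:
        return None  # assumed: ever-smoker with unknown current status
    return "current" if smoker_now else "past"


def cessation_age_group(age_stopped):
    """Band the age at which a past smoker stopped smoking regularly."""
    if age_stopped is None:
        return None
    if age_stopped < 25:
        return "<25"
    if age_stopped < 35:
        return "25-34"
    if age_stopped < 45:
        return "35-44"
    if age_stopped < 55:
        return "45-54"
    return ">=55"


def cigarettes_per_day_group(cigarettes):
    """Band average daily consumption among current and past smokers."""
    if cigarettes is None:
        return None
    if cigarettes <= 14:
        return "<=14"
    if cigarettes <= 24:
        return "15-24"
    return ">=25"
```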
Mortality rates since baseline and 95% confidence intervals (CIs) were calculated for participants who reported being current, past, and never-smokers at baseline; these were indirectly standardised for age to the person-year distribution of the whole cohort population [15], and were presented separately for men and women. Hazard ratios (which are equivalent to, and described here as, relative risks [RRs]) for mortality were estimated separately for men and women, and according to birth cohorts with sufficient amounts of data, using Cox regression modelling in which the underlying time variable was age. Estimates are shown initially accounting for age only (automatically adjusted for as the underlying time variable). Models are then presented adjusted for additional covariates derived from baseline questionnaire and participant location data, including education (<secondary school, secondary school graduation, trade/apprenticeship/certificate/diploma, university graduate); annual pre-tax household income (AUD <$20,000, $20,000–$39,999, $40,000–$69,999, ≥$70,000); region of residence (major cities, inner regional areas, outer regional/remote areas); alcohol consumption (0, 1–14, ≥15 alcoholic drinks/week); and body mass index (BMI) (<20, 20–24.99, 25–29.99, ≥30 kg/m²). Missing values for covariates other than smoking status were included in the models as separate categories. Hypertension and dyslipidaemia were considered likely to be part of the causal pathway between smoking and mortality and were not adjusted for. Sensitivity analyses were conducted: i) adjusting additionally for physical activity; and ii) categorising current smokers as those who reported being current smokers at baseline and past smokers who had ceased smoking 3 or fewer years prior to baseline.
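The treatment of missing covariate values as a separate category can be sketched for two of the covariates. The function names and labels are hypothetical, and alcohol consumption is assumed to be recorded as a whole number of drinks per week.

```python
import math


def _is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def bmi_category(bmi):
    """Band BMI (kg/m^2) as in the text, with an explicit 'missing' level."""
    if _is_missing(bmi):
        return "missing"
    if bmi < 20:
        return "<20"
    if bmi < 25:
        return "20-24.99"
    if bmi < 30:
        return "25-29.99"
    return ">=30"


def alcohol_category(drinks_per_week):
    """Band weekly alcoholic drinks, with an explicit 'missing' level."""
    if _is_missing(drinks_per_week):
        return "missing"
    if drinks_per_week == 0:
        return "0"
    if drinks_per_week <= 14:
        return "1-14"
    return ">=15"
```

Keeping "missing" as its own level lets participants with incomplete covariate data remain in the adjusted models rather than being dropped.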
Among current and never-smokers at recruitment, mortality rates and RRs by amount smoked were calculated according to categories of consumption reported at recruitment (≤14, 15–24, and ≥25 cigarettes/day). Mortality rates were then plotted against the mean number of cigarettes within each category reported at the 3-year resurvey among those who reported being current smokers at resurvey, as this was considered the best estimate of long-term mean consumption among all in that category, before the study started (Additional file 1: Table S1). Rates in never-smokers were plotted against the “0” on the x-axis. The RR of dying during the follow-up period was then quantified among past versus never-smokers, in those ceasing smoking at ages <25, 25–34, 35–44, and 45–54 years. Sensitivity analyses were conducted restricting the data to individuals aged ≥55 years, ensuring that all participants had the opportunity to quit at these ages.
The proportionality assumption of the Cox regression models was verified by plotting the Schoenfeld residuals against the time variable in each model, with a stratified form or time-dependent form of the model used where covariates displayed non-proportionality of hazards. No violations of the proportionality assumption were detected for the main exposure. Minor violations were observed in covariates for certain models and a stratified Cox model was fitted, as follows: overall analyses of current and past versus never-smokers – model stratified by education; analyses relating to birth decade – model stratified by alcohol, education, and income; analyses relating to number of cigarettes smoked per day – model stratified by income; analyses relating to age at smoking cessation – model stratified by alcohol and education.
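For reference, the stratified form of the Cox model mentioned above can be written as follows. The notation is introduced here for illustration and is not taken from the source: \(a\) denotes attained age (the underlying time variable), \(s\) indexes the levels of a stratifying covariate such as education, \(\mathbf{x}\) is the vector of remaining covariates including smoking status, and \(h_{0s}\) is an unspecified stratum-specific baseline hazard.

\[ h_s(a \mid \mathbf{x}) = h_{0s}(a)\, \exp\left( \boldsymbol{\beta}^{\top} \mathbf{x} \right) \]

Because each stratum has its own baseline hazard, no proportionality is assumed across levels of the stratifying covariate, while the coefficients \(\boldsymbol{\beta}\), and hence the RR \(\exp(\beta_k)\) for each remaining covariate, are shared across strata.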
Separately for males and females, absolute mortality rates for Australian smokers and non-smokers for age group \( i \) (45–54, 55–64, and 65–74 years) were estimated by \( M_i / \left(1 + (\mathrm{RR} - 1) P_i\right) \) for non-smokers and RR times this for smokers [16] (where \( M_i \) and \( P_i \) represent 2010/2011 Australian population mortality rates and smoking prevalence estimated from other sources, respectively [17, 18], and RR represents all-cause current smoker versus never-smoker RRs estimated in the current study). From these rates, cumulative risks of death for non-smokers and smokers at age \( x \) (55, 65, or 75 years) from age 45 were estimated by \( 1 - \exp \left( -10 \sum_{i = (45\text{–}54)}^{x} \mathrm{MR}_i \right) \) (where \( \mathrm{MR}_i \) is either the smoker or non-smoker mortality rate for age group \( i \)) [19].
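The two formulae above can be illustrated with a short worked sketch. The function names are hypothetical, and the numerical inputs in the usage note are invented for illustration; they are not the population rates, prevalences, or RRs used in the study. Rates are assumed to be expressed per person-year, so that multiplying by the 10-year width of each age group gives the cumulative hazard.

```python
import math


def absolute_rates(population_rate, prevalence, rr):
    """Split a population mortality rate M_i into non-smoker and smoker rates.

    Non-smoker rate: M_i / (1 + (RR - 1) * P_i); smoker rate: RR times this.
    """
    non_smoker = population_rate / (1.0 + (rr - 1.0) * prevalence)
    smoker = rr * non_smoker
    return non_smoker, smoker


def cumulative_risk(age_group_rates, band_width_years=10):
    """Cumulative risk of death: 1 - exp(-10 * sum of age-group rates).

    `age_group_rates` holds the rates MR_i (per person-year) for consecutive
    10-year age groups from 45-54 up to the group ending at age x.
    """
    return 1.0 - math.exp(-band_width_years * sum(age_group_rates))
```

For example, with an illustrative population rate of 0.003 per person-year, a smoking prevalence of 0.2, and an RR of 3, `absolute_rates(0.003, 0.2, 3.0)` gives a non-smoker rate of 0.003/1.4 and a smoker rate three times as large; weighting the two rates by 0.8 and 0.2 recovers the population rate of 0.003, which is the consistency condition the formula is built on.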
All statistical tests were two-sided, using a significance level of 5%. Analyses were carried out using SAS® version 9.3 [20] and Stata® versions 11 and 13.
Ethical approval for the 45 and Up Study as a whole was provided by the University of New South Wales Human Research Ethics Committee and specifically for this study by the NSW Population and Health Services Research Ethics Committee and the Australian National University Human Research Ethics Committee.