Arsenic levels in drinking water
Data on the arsenic levels in drinking water were obtained from a nationwide survey conducted by the Taiwan Provincial Institute of Environmental Sanitation [20] using the standard mercuric bromide stain method [21]. According to the standard solutions used in that survey, drinking water arsenic levels can be grouped into 10 categories: "undetectable" (test result compatible with the blank control), "trace" (test result between the blank control and the 0.01 mg/L standard), "0.01 mg/L," "0.02 mg/L," "0.03-0.04 mg/L," "0.05-0.08 mg/L," "0.09-0.16 mg/L," "0.17-0.32 mg/L," "0.33-0.64 mg/L," and "> 0.64 mg/L." When Taiwanese laboratories applied this method, the limit of detection (LOD; defined as the mean plus three times the standard deviation obtained from repeated testing of blank controls) was 0.04 mg/L [22]. Therefore, in the data analyses, all levels at or below the LOD were combined into a single category, "< 0.05 mg/L."
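The regrouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category labels are taken from the text, while the function names `analysis_category` and `percent_of_wells` and the sample input are hypothetical and are not part of the original survey or analysis.

```python
from collections import Counter

# The ten survey categories in ascending order; labels follow the text (mg/L).
SURVEY_CATEGORIES = [
    "undetectable", "trace", "0.01", "0.02", "0.03-0.04",
    "0.05-0.08", "0.09-0.16", "0.17-0.32", "0.33-0.64", ">0.64",
]
# The first five categories lie at or below the LOD of 0.04 mg/L.
AT_OR_BELOW_LOD = frozenset(SURVEY_CATEGORIES[:5])
# The six categories used in the analyses after regrouping.
ANALYSIS_CATEGORIES = ["<0.05"] + SURVEY_CATEGORIES[5:]


def analysis_category(survey_category):
    """Map one survey category to its analysis category (hypothetical helper)."""
    if survey_category not in SURVEY_CATEGORIES:
        raise ValueError(f"unknown survey category: {survey_category!r}")
    return "<0.05" if survey_category in AT_OR_BELOW_LOD else survey_category


def percent_of_wells(well_categories):
    """Percentage of a township's wells in each analysis category."""
    counts = Counter(analysis_category(c) for c in well_categories)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no wells supplied")
    return {cat: 100.0 * counts[cat] / total for cat in ANALYSIS_CATEGORIES}
```

For example, `percent_of_wells(["undetectable", "trace", "0.02", ">0.64"])` assigns three of the four hypothetical wells to "<0.05" and one to ">0.64".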
The original survey data are available for 65269 wells in 243 townships, with an average of about 269 wells in each township. Because the survey was designed specifically to measure the arsenic level in drinking water, the standard report form did not include other chemical characteristics of the well water. As in most similar ecological studies, the number of users of each well was not recorded. Almost all the measurements were performed between 1974 and 1976. At the time of the survey, bottled water was generally unavailable, so it can be assumed that residents of the surveyed areas took most of their drinking water from the same source every day.
Data on the number of residents by gender and age in each township were obtained from the Department of Internal Affairs to estimate the number of people in each unit population. The 243 townships had a total of about 11.7 million residents, about half of the population in Taiwan, and Table 1 presents the estimated number of people in each exposure category. About 48000 people drank water in the highest arsenic category ("> 0.64 mg/L"), which was the smallest population in an exposure category.
Table 1
Distribution of arsenic exposure levels in well water
Arsenic level (mg/L) | < 0.05 | 0.05-0.08 | 0.09-0.16 | 0.17-0.32 | 0.33-0.64 | > 0.64 |
Average % of wells in each township | 91.0 | 2.8 | 3.0 | 1.7 | 1.0 | 0.6 |
Estimated population^a | | | | | | |
Males | 5 605 000 | 169 000 | 159 000 | 84 000 | 47 000 | 25 000 |
Females | 5 144 000 | 156 000 | 147 000 | 77 000 | 43 000 | 23 000 |
Total | 10 749 000 | 326 000 | 305 000 | 162 000 | 90 000 | 48 000 |
Collection of other data
Cases of bladder cancer diagnosed between January 1, 1980 and December 31, 1989 were identified using the computerized database of the National Cancer Registry Program, which is operated by the Department of Health. Gender, age, diagnoses, and township of residence were reported for each registered case. Cases with ICD-O codes [23] from 188.0 to 188.9 were defined as bladder cancer cases, and 3068 cases, including 2276 men and 792 women, were identified.
Demographic data on the residents in each township at the end of 1985, the midpoint of the ten-year study period, were obtained from the Department of Internal Affairs. The number of residents was calculated for each of seven age groups: 0-19 years, 20-29 years, 30-39 years, 40-49 years, 50-59 years, 60-69 years, and above 69 years.
An urbanization index developed by Wu [24] on the basis of 19 socioeconomic factors was adopted to assess the associations between urbanization and incidence of bladder cancer. The study townships had urbanization indexes ranging from -1.410 to 3.257 (mean = 0.224, standard deviation = 1.128).
The magnitude of cigarette sales was used to evaluate the effects of smoking. In Taiwan, cigarette selling was a monopoly business operated by the Tobacco and Alcohol Monopoly Bureau during the study period. Sales records collected from the Bureau in a previous study [7] were adopted to estimate the number of cigarettes sold per capita per year in each township, which had a range of 14.94 to 689.93 (mean = 63.76, standard deviation = 66.11). The unit of cigarette sales used in the analyses was 100 cigarettes.
Data analysis
For comparison, three different methods were applied to analyze the data, each requiring different information. Because population size differed across the townships, the population of each township was used as the weighting factor in the regression models in all three approaches.
The first approach, referred to as "Direct Method," applies the direct standardization procedure. For each township, gender-age specific cumulative incidence rates over the ten-year period were calculated, and then a standardized incidence rate (SIRate) was obtained by adopting the age distribution of the world standard population in 1976 [25] as follows:

$$\mathrm{SIRate} = \frac{\sum_i W_i \times IR_i}{\sum_i W_i}$$

where $W_i$ is the number of people in the $i$th age group in the standard population, and $IR_i$ is the age-specific average annual cumulative incidence rate of the $i$th age group. The unit for $IR_i$ was cases per 100,000. Then, the risk associated with each exposure level can be estimated through the following regression model:

$$\mathrm{SIRate} = \alpha + \sum_{j=1}^{5} \beta_j X_j + \gamma U + \delta T \tag{1}$$

where for each township, $X_j$ is the proportion (as percentage) of residents with arsenic exposures in category $j$, $U$ is the urbanization index, and $T$ is the number of cigarettes (in hundreds) sold per capita. Because the exposure category "< 0.05 mg/L" was used as the reference, $X_1$ = percentage of residents in the "0.05-0.08 mg/L" category, $X_2$ = percentage of residents in the "0.09-0.16 mg/L" category, and so on. In this case, $\alpha$ (intercept) is the estimated background cumulative incidence rate, $\beta_j$ indicates the rate difference (RD) associated with each 1% increase in residents in category $j$, $\gamma$ indicates the RD associated with each one-unit increase in urbanization index, and $\delta$ indicates the RD associated with each 100 cigarettes sold per capita.
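As a rough illustration of the Direct Method, the sketch below computes a directly standardized rate and fits a population-weighted linear regression with NumPy. It assumes that the township populations enter as ordinary weighted least squares weights; the function names and any input values are hypothetical and do not reproduce the software or data used in the study.

```python
import numpy as np


def direct_standardized_rate(age_rates, std_weights):
    """SIRate = sum(W_i * IR_i) / sum(W_i) for one township."""
    age_rates = np.asarray(age_rates, dtype=float)
    std_weights = np.asarray(std_weights, dtype=float)
    return float(np.sum(std_weights * age_rates) / np.sum(std_weights))


def weighted_least_squares(predictors, outcome, weights):
    """Fit outcome = intercept + predictors @ coefficients by weighted least squares.

    predictors: (n_townships, n_predictors), e.g. columns X_1..X_5, U, T.
    weights:    township populations (assumed to act as WLS weights).
    Returns the coefficient vector with the intercept first.
    """
    predictors = np.asarray(predictors, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    design = np.column_stack([np.ones(len(outcome)), predictors])
    # Scaling rows by sqrt(weight) turns WLS into an ordinary least squares problem.
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], outcome * root_w, rcond=None)
    return coef
```

With the SIRate of each township as `outcome`, the first returned coefficient plays the role of $\alpha$ and the remaining ones the roles of $\beta_j$, $\gamma$, and $\delta$ in Model 1.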
The second approach, referred to as "Indirect Method," applies the indirect standardization procedure, which adopts the age-specific incidence rates in a reference population and obtains the expected number of cases in the $i$th age group ($E_i$) in a given township as follows:

$$E_i = P_i \times RIR_i$$

where $P_i$ is the number of people in the $i$th age group in the township, and $RIR_i$ is the age-specific cumulative incidence rate of the $i$th age group in the reference population. The total population of the 243 townships combined was used as the reference population in the analysis. A standardized incidence ratio (SIRatio) for each township can thus be obtained as follows:

$$\mathrm{SIRatio} = \frac{\sum_i O_i}{\sum_i E_i}$$

where $O_i$ is the observed number of cases in the $i$th age group in the unit population. Then, the risk associated with each exposure level can be estimated through the following regression model:

$$\mathrm{SIRatio} = \alpha' + \sum_{j=1}^{5} \beta_j' X_j + \gamma' U + \delta' T \tag{2}$$

where $X_j$, $U$, and $T$ are defined as in Model 1. In this case, $\alpha'$ is the estimated background ratio, $\beta_j'$ indicates the increase in SIRatio associated with each 1% increase in residents in category $j$, $\gamma'$ indicates the increase in SIRatio associated with each one-unit increase in urbanization index, and $\delta'$ indicates the increase in SIRatio associated with each 100 cigarettes sold per capita. In this model, SIRatio (a rate ratio) needs to be forced to take the value 1 when the arsenic exposure is within the reference category ("< 0.05 mg/L") and all other variables are set to their reference categories. This can be accomplished by coding the "< 0.05 mg/L" group as the reference category.
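The indirect standardization step can be illustrated with a short NumPy sketch. It assumes that case counts and populations are arranged as arrays with one row per township and one column per age group, and that the reference rates are pooled over all townships as described above; the function names are hypothetical.

```python
import numpy as np


def reference_rates(cases, population):
    """Age-specific rates RIR_i in the pooled reference population.

    cases, population: arrays of shape (n_townships, n_age_groups).
    """
    cases = np.asarray(cases, dtype=float)
    population = np.asarray(population, dtype=float)
    return cases.sum(axis=0) / population.sum(axis=0)


def standardized_incidence_ratio(observed, population, ref_rates):
    """SIRatio = sum(O_i) / sum(E_i) for one township, with E_i = P_i * RIR_i."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(population, dtype=float) * np.asarray(ref_rates, dtype=float)
    return float(observed.sum() / expected.sum())
```

Because the reference population is the pool of all study townships, the expected counts summed over townships equal the observed total, so township SIRatios scatter around 1.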
The third approach, referred to as "Variable Method," treats age as a predictor of bladder cancer and adds independent variables to the regression model to evaluate and adjust for the effects of age as follows:

$$\mathrm{CIR} = \alpha'' + \sum_{j=1}^{5} \beta_j'' X_j + \gamma'' U + \delta'' T + \sum_{k=1}^{6} \theta_k A_k \tag{3}$$

where CIR is the crude cumulative incidence rate, $X_j$, $U$, and $T$ are defined as in Model 1, and $A_k$ is the proportion (as percentage) of residents in age group $k$ in each township. Because there are seven age groups, six independent variables derived from dummy variables at the individual level were used in the regression model [5]. The age group "0-19 years" was used as the reference, so $A_1$ = percentage of residents in the age group "20-29 years," $A_2$ = percentage of residents in the age group "30-39 years," and so on. In this case, $\alpha''$ is the estimated background cumulative incidence rate; $\beta_j''$, $\gamma''$, and $\delta''$ are defined as $\beta_j$, $\gamma$, and $\delta$ in Model 1, respectively; and $\theta_k$ indicates the RD associated with each 1% increase in residents in age group $k$.
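To make the structure of Model 3 concrete, the following sketch assembles a design matrix for the Variable Method. It is an assumption-laden illustration: the function name and argument layout are hypothetical, and the exposure input is taken to hold only the five non-reference categories. The reference age group is dropped because the seven age percentages of a township sum to 100, so including all seven alongside an intercept would make the columns linearly dependent.

```python
import numpy as np


def variable_method_design(exposure_pct, urbanization, cigarettes, age_pct):
    """Build the Model 3 design matrix (hypothetical helper).

    exposure_pct: (n, 5) percentages in the five non-reference arsenic categories.
    urbanization: (n,) urbanization index U.
    cigarettes:   (n,) cigarettes sold per capita, in hundreds (T).
    age_pct:      (n, 7) percentages in the seven age groups, youngest first.
    """
    exposure_pct = np.asarray(exposure_pct, dtype=float)
    age_pct = np.asarray(age_pct, dtype=float)
    n = exposure_pct.shape[0]
    return np.column_stack([
        np.ones(n),                            # intercept
        exposure_pct,                          # X_1 .. X_5
        np.asarray(urbanization, dtype=float), # U
        np.asarray(cigarettes, dtype=float),   # T
        age_pct[:, 1:],                        # A_1 .. A_6 ("0-19 years" dropped)
    ])
```

The resulting matrix has 14 columns: one intercept, five exposure terms, $U$, $T$, and six age terms.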
Models 1 and 3 generate estimates of RDs, whereas Model 2 generates estimates of incremental rate ratios. Therefore, the RD estimates from Models 1 and 3 were divided by the corresponding estimates of background rates ($\alpha$ and $\alpha''$, respectively) to obtain estimates of incremental rate ratios and so facilitate comparison among the three methods.
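The rescaling of rate differences can be sketched as a one-line division. The function name and the numbers in the usage note are hypothetical and serve only to show the arithmetic.

```python
def incremental_rate_ratios(rate_differences, background_rate):
    """Divide each RD estimate by the background rate (intercept) of the same model."""
    if background_rate <= 0:
        raise ValueError("background rate must be positive")
    return [rd / background_rate for rd in rate_differences]
```

For instance, with a hypothetical background rate of 2.0 cases per 100,000 and RD estimates of 0.5 and 1.0, the incremental rate ratios are 0.25 and 0.5 per 1% increase in residents in the corresponding category.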