Introduction

The National Center for Environmental Health (NCEH) at the Centers for Disease Control and Prevention (CDC) provides data on an increasing number of chemicals in blood and urine for a nationally representative sample of the US population as part of the National Health and Nutrition Examination Survey (NHANES). Recently, Statistics Canada initiated a similar nationally representative survey called the Canadian Health Measures Survey (CHMS) and released data from its first collection cycle (cycle 1, 2007–2009).1, 2 Comparisons of population exposures across countries can be highly informative and can generate hypotheses regarding differences and similarities in exposures from various sources such as air, water, soil, food, and consumer products. To ensure that such comparisons are scientifically meaningful, it is essential to evaluate aspects of the surveys' methods that have the potential to impact data comparability. These include differences in urine and blood collection and handling, analytical approaches, and data analysis as well as differences in population characteristics. For this paper, we analyzed spot urinary bisphenol A (BPA) data from NHANES and CHMS to highlight important methodological issues that should be reviewed before comparing population exposures using these data sets. This is the first time that nationally representative data for Canada have been released, permitting comparison with US population-based data. BPA was selected because it is the subject of scientific and regulatory interest in both countries, there are sufficient data to assess intakes based on urinary concentrations, and there may be sufficient data to begin to examine the effect of temporal changes on our ability to compare intakes between surveys.

To determine whether population estimates differ between surveys, we first describe spot urinary BPA concentrations for Canada (CHMS 2007–2009) and the US (NHANES 2007–2008) and estimate intakes (as dose in units of nanograms per kilogram-day) based on those data. We then explore the possible reasons for the differences between the US and Canadian estimates by examining the survey methodologies. We focus on population characteristics, procedures related to collection and handling of urine samples, analytical methods, and data reporting. As both CHMS and NHANES will continue to generate population-based biomonitoring data for many chemicals, this review is timely and important for researchers seeking to compare data across countries.

Methods

Estimation of Daily BPA Intakes

The method for estimating daily intake for individuals with spot urinary BPA data from NHANES has been described previously3, 4; a similar method was used to estimate daily intakes for the 2007–2008 NHANES (http://www.cdc.gov/nchs/nhanes/nhanes2007-2008/lab07_08.htm) and the 2007–2009 CHMS databases. For each urinary BPA value, the concentration (ng/ml) was combined with an estimated 24-h urinary output volume (ml) to estimate daily BPA excretion (ng/day), which is assumed to be the same as the daily intake. The daily intake for each individual was divided by that individual's body weight to give daily intake adjusted for body weight (ng/kg-day) (Eq. 1). As body weights for the United States are generally higher than those for Canadians,5 intake, which includes an adjustment for body weight, is a more informative measure of comparative exposure than urinary BPA; intakes also allow for comparison with health-based guidance values such as reference doses (RfDs) and tolerable daily intakes (TDIs).
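Equation 1 is not reproduced in this text. Based on the description above, it can be written as follows; the symbols C, V, and BW are labels chosen here for illustration and are not necessarily the authors' notation.

```latex
\begin{equation}
\text{Daily intake}\ (\text{ng/kg-day})
  = \frac{C\ (\text{ng/ml}) \times V\ (\text{ml/day})}{BW\ (\text{kg})}
\end{equation}
```

Here C is the spot urinary BPA concentration, V is the estimated 24-h urinary output volume, and BW is the individual's body weight.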

Because 24-h urinary output data were not collected as part of NHANES or CHMS, generic values (given in refs 3, 4) describing typical urinary output based on age and gender were used to estimate total daily BPA excretion in nanograms. The urinary BPA data were used to represent daily intake because excretion of BPA (parent and metabolite) into urine is essentially complete in 24 h.6 Volume-based urinary BPA data (ng/ml urine), rather than creatinine-adjusted BPA data, were used (the rationale is described in refs 3, 4).
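The lookup-and-scale step described above can be sketched in Python. This is a minimal illustration, not the authors' code: the output volumes in the table are hypothetical placeholders (the values actually used are those given in refs 3, 4), and the function and variable names are assumptions.

```python
# Hypothetical generic 24-h urinary output volumes (ml/day) keyed by
# (age group, gender). Placeholder numbers for illustration only; the values
# actually used in the analysis are those given in refs 3 and 4.
GENERIC_URINE_OUTPUT_ML_PER_DAY = {
    ("6-11", "F"): 600.0,
    ("6-11", "M"): 600.0,
    ("20-39", "F"): 1200.0,
    ("20-39", "M"): 1600.0,
}


def daily_intake_ng_per_kg_day(conc_ng_per_ml, age_group, gender, body_weight_kg):
    """Estimate body-weight-adjusted daily BPA intake from one spot urine value."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    volume_ml_per_day = GENERIC_URINE_OUTPUT_ML_PER_DAY[(age_group, gender)]
    # Daily excretion (ng/day), assumed equal to daily intake.
    excretion_ng_per_day = conc_ng_per_ml * volume_ml_per_day
    return excretion_ng_per_day / body_weight_kg


# Example with made-up inputs: 2.0 ng/ml, male aged 20-39, 80 kg.
example_intake = daily_intake_ng_per_kg_day(2.0, "20-39", "M", 80.0)
```

Keeping the generic volumes in a single table makes the dependence on the assumed urinary output explicit, which matters because that assumption scales every intake estimate linearly.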

Various researchers have explored intra-individual variability in urinary BPA measures. Although this variability introduces uncertainty in population estimates of BPA intake, Ye et al.7 concluded “…when the population investigated is sufficiently large and samples are randomly collected relative to meal ingestion times and bladder emptying times, the single spot-sampling approach may adequately reflect the average exposure of the population to BPA.” Further, Mahalingaiah et al.8 stated that “despite within-person variability in urinary BPA concentrations, a single sample is predictive of long-term exposure (over weeks and months).” LaKind and Naiman3 concluded that NHANES cross-sectional data provide a reasonable reference range for single day exposures and for estimating average population exposures (and therefore intakes).

Distributions of intakes representative of the US and Canadian populations were determined for all participants 6–79 years, by gender and age groups (6–11, 12–19, 20–39, 40–59, and 60–79 years). This method differs slightly from past methodology: NHANES includes individuals over the age of 79, and previous estimates of intakes included these individuals. Since the age cutoff for CHMS is 79 years, the NHANES database was truncated at 79 years for consistency. The total numbers of respondents with all necessary data for estimating daily BPA intake were 2467 for NHANES and 5472 for CHMS.

Calculations of point estimates and confidence intervals (CIs) for the geometric mean intakes and intake percentiles for the US population were carried out in the R platform,9 using the R survey package.10 To estimate various population (and population subgroup) intake quantities such as means or percentiles, weighted means and percentiles were calculated using NHANES 2007–2008 2-year weights provided by CDC.11 CIs for percentiles were calculated using the survey package's implementation of Woodruff's12 method. Analysis of CHMS data was performed using SAS version 9.2 (SAS Institute, Cary, NC, USA, 2003) and SUDAAN version 10 (RTI International, Research Triangle Park, NC, USA, 2008). Geometric means, selected percentiles and corresponding CIs overall and by population subgroups were calculated using the bootstrap technique; the degrees of freedom were specified in the software as degrees of freedom=11 to account for the complex survey design.2 A comparison of the statistical methods used for NHANES and CHMS data was conducted; the two methods yielded the same estimated intakes.
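To make the weighted point estimates concrete, the following Python sketch computes a weighted geometric mean and a simple weighted percentile. It is an illustration only: it is not the R survey package or SUDAAN, it does not compute design-based confidence intervals (Woodruff's method or the bootstrap), and the percentile rule shown (smallest value whose cumulative weight share reaches the target) is an assumption that may differ from the interpolation those packages use.

```python
import math


def weighted_geometric_mean(values, weights):
    """Weighted geometric mean: exp(sum(w * ln(x)) / sum(w))."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(v <= 0 for v in values):
        raise ValueError("a geometric mean requires strictly positive values")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


def weighted_percentile(values, weights, fraction):
    """Smallest value whose cumulative weight share is at least `fraction`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    pairs = sorted(zip(values, weights))
    threshold = fraction * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]
```

For example, `weighted_geometric_mean(intakes, sample_weights)` would be applied to the per-person intakes with the 2-year survey weights; the variable names here are hypothetical.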

For measures below the limit of detection (LOD), CHMS assigns a value of LOD/2; NHANES assigns a value of LOD/√2. For consistency between the US and Canada, we assigned a value of LOD/2 to measurements below the LOD in both data sets. However, we also evaluated the impact on urinary BPA geometric means of using LOD/2 (CHMS) versus LOD/√2 (NHANES).
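The two substitution rules can be sketched as a small Python helper. This is an assumed representation, not either survey's code: a non-detect is passed as `None` (or as a value below the LOD), and the LOD of 0.4 ng/ml in the example is a hypothetical figure used only for illustration.

```python
import math


def substitute_below_lod(value, lod, rule="half"):
    """Return the measured value, or a substitute if it is below the LOD.

    rule="half"  -> LOD/2        (CHMS convention)
    rule="sqrt2" -> LOD/sqrt(2)  (NHANES convention)
    """
    if lod <= 0:
        raise ValueError("LOD must be positive")
    if value is not None and value >= lod:
        return value
    if rule == "half":
        return lod / 2.0
    if rule == "sqrt2":
        return lod / math.sqrt(2.0)
    raise ValueError("rule must be 'half' or 'sqrt2'")


# Hypothetical LOD of 0.4 ng/ml; None marks a non-detect.
hypothetical_lod = 0.4
chms_style = substitute_below_lod(None, hypothetical_lod, rule="half")
nhanes_style = substitute_below_lod(None, hypothetical_lod, rule="sqrt2")
```

Because the two substitutes differ only by a factor of √2 and apply only to non-detects, their effect on a geometric mean shrinks as the share of non-detects falls.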

Survey Methodology Comparison

Differences in the NHANES and CHMS methods related to population characteristics, urine collection, analytical methods, and data reporting were evaluated using information from the literature, statistical investigations, and inter-laboratory comparisons. Survey method information was derived from NHANES documentation (http://www.cdc.gov/nchs/nhanes/nhanes2007-2008/datadoc_changes_0708.htm) and from CHMS reports.2, 13, 14

As a result of the short physiologic clearance time for BPA (half-life of <2 h;15), if the preponderance of BPA exposure is via the diet, then a longer fasting time should correlate with lower urinary BPA levels. Thus, systematic differences in either fasting times or adherence rates between surveys could hinder the comparison of results. To test this, a correlation test between fasting time and log urinary BPA was conducted by age groups.
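A correlation of this kind can be sketched in Python as below. The text does not state which correlation statistic was used or whether survey weights entered the test, so the unweighted Pearson correlation here is an assumption, and the function names are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Unweighted Pearson correlation coefficient between two sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def fasting_vs_log_bpa(fasting_hours, bpa_ng_per_ml):
    """Correlate fasting time with the natural log of urinary BPA."""
    log_bpa = [math.log(c) for c in bpa_ng_per_ml]
    return pearson_correlation(fasting_hours, log_bpa)
```

In practice the function would be called once per age group, for example `fasting_vs_log_bpa(hours_12_to_19, bpa_12_to_19)`. A clearly negative coefficient would be consistent with a predominantly dietary source, while a value near zero would not.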

To explore the possibility of analytical bias between CDC and INSPQ (the Institut national de santé publique du Québec; the laboratory that analyzed samples as part of the CHMS), data from two sets of proficiency testing materials (PTMs) were evaluated. Both laboratories assessed PTMs for urinary BPA from the Arctic Monitoring and Assessment Program INSPQ ring test and the German External Quality Assessment Scheme for Analyses in Biological Materials (G-EQUAS). In each of these programs, registered participants received a set of two to three samples twice a year. Each laboratory returned its analytical results to the program and received a report showing the reliability of its results. Although the proficiency testing took place during 2010–2011, after NHANES 2007–2008 and CHMS 2007–2009 were completed, these data were used as there are no data from actual urine samples with which to assess analytical bias.

Results

Comparison of Urinary BPA and Intakes in the US and Canada

Urinary BPA levels in Canada were statistically significantly lower than in the US for all age/gender groupings (Tables 1 and 2). In general, body weights for Canadians are lower than in the United States (Table 3), so for equivalent urinary BPA levels for US and Canadian individuals, intakes for Canadians would be higher. Despite the lower Canadian body weights, BPA intakes are statistically significantly lower in Canada as compared with the US for all age/gender groupings (Tables 4 and 5).

Table 1 Urinary BPA concentrations (ng/ml) for Canada (CHMS 2007–2009) and the US (NHANES 2007–2008) for participants 6 to 79 years of age and by gender.
Table 2 Urinary BPA concentrations (ng/ml) by age group for Canada (CHMS 2007–2009) and the US (NHANES 2007–2008).
Table 3 Comparison of mean body weights (kg) by age group and gender for the US (NHANES 2007–2008) and Canada (CHMS 2007–2009).
Table 4 BPA intakes (ng/kg-day) for Canada (CHMS 2007–2009) and the US (NHANES 2007–2008).
Table 5 BPA intakes (ng/kg-day) by age group for Canada (CHMS 2007–2009) and the US (NHANES 2007–2008).

Comparison of CHMS and NHANES Methods

The following methodological aspects and their potential impact on spot urinary BPA levels are described below: urine collection and population characteristics (Table 6), analytical procedures (Table 7), and data reporting.

Table 6 Comparison of relevant aspects of the CHMS and NHANES programs and population characteristics.14, 16
Table 7 Analytical method comparison for urinary BPA: INSPQ (Canada) and CDC (the US).

Urine Collection and Population Characteristics

Urine collection. The CHMS performed field blank testing with distilled water at all sites to account for baseline contamination from the site environment, collection materials, and transport method. The field blank procedures mimicked all the procedures of the survey samples including the urine collection, handling, storage, shipping, and analysis. After adjustment for reagent blanks, field blank BPA data indicated no BPA contamination.14 NHANES does not use field blanks but rather tests all new collection materials to ensure no contamination exists. CHMS requested midstream urine samples; NHANES did not specify the portion of the urine stream to be sampled.

Collection timeframe. Although CHMS and NHANES samples used in this study were not collected over identical time periods (Table 6), there was only a 2-month shift in sampling times between the two surveys, which is not expected to influence overall results.

Population sampled. In NHANES, urinary BPA data were available for ages 6 years and older (no upper age cutoff), while in CHMS the age range was 6 to 79 years. For the CHMS sample with urinary BPA data, 81.2% of the respondents were white and <3.9% were black, while for NHANES 40% of the participants were white and 24% were black (blacks were oversampled and sample weights applied to produce an unbiased national estimate) (http://www.cdc.gov/nchs/tutorials/nhanes/SurveyDesign/SampleDesign/Info1.htm). In the United States, blacks had higher urinary BPA levels (geometric mean of 2.6 ng/ml, 95% CIs: 2.4, 2.9) compared with whites (geometric mean of 2.1 ng/ml, 95% CIs: 1.9, 2.3). Owing to the small sample size, the urinary BPA levels of black Canadians could not be assessed. The median urinary BPA (95% CIs) for whites in the United States was 2.1 ng/ml (1.8, 2.3), which was essentially the same as the median for the overall US population. For whites in Canada, the median urinary BPA (95% CIs) was 1.3 ng/ml (1.2, 1.5), also the same as for the overall population.

Fasting. For NHANES, respondents aged 12 years and older were instructed to fast for 9 h (but not >16 h) before their morning appointment. Fasting was not required for respondents <12 years of age and for those with afternoon or evening appointments (http://www.cdc.gov/nchs/data/nhanes/nhanes_07_08/HouseholdInterviewer_07.pdf). For CHMS, respondents were instructed to fast for 12 h before the morning appointment or for 2 h before an afternoon or evening appointment.13 Information on hours since food/drink was consumed before the appointment was recorded for each respondent.

In this study, overall, no association between fasting time and urinary BPA level was found for either survey. For NHANES, younger respondents (6- to 19-year olds) had shorter fasting times (median of 2 h for 6- to 11-year olds and 4 h for 12- to 19-year olds) compared with older participants (median of 5 h for 20- to 79-year olds) and had higher urinary BPA levels (Table 2).

Analytical Procedures

Major differences between analytical procedures in the US and Canada include the use of liquid versus gas chromatographic techniques and the use of enzymes during the hydrolysis stage with different glucuronidase and sulfatase activities (Table 7). The use of different analytical methods is less important than whether those methods produce comparable results18; both INSPQ and CDC use an internal calibration standard (13C-labeled BPA) to compensate for some of the analytical differences. However, utilization of enzymes with different efficiencies cannot be completely compensated for by internal standardization and could generate bias in the overall results.

Seven samples from the two proficiency programs were analyzed by both laboratories (Table 8). Individual laboratory data are not provided here in order to protect confidentiality of the participating laboratories’ identities in these proficiency testing programs. For CDC and INSPQ, only their mean urinary BPA concentration from the seven PTMs is compared with a consensus concentration; the consensus concentration is defined as the mean of the seven median values from all reporting laboratories (Table 8). The results are as follows (CDC value was provided by the NCEH laboratory for this comparison):

  1. Consensus PTM mean urinary BPA concentration: 4.77 μg/l
  2. INSPQ PTM mean urinary BPA concentration: 4.98 μg/l
  3. CDC PTM mean urinary BPA concentration: 5.81 μg/l

Table 8 Description of proficiency testing materials used for inter-laboratory comparisons of analytical methods.

The INSPQ mean concentration is 14% lower than that for CDC. As compared with the consensus value, INSPQ and CDC results are 4% and 22% higher, respectively. Considering the differences in analytical methodology and the complicated nature of sample preparation, these differences are minor. The proficiency testing was conducted using samples with a urinary BPA concentration of approximately 5 μg/l. It is not known how or whether the percent difference observed between laboratories would change if the PTM concentrations were closer to the LOD.
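The percentages quoted above follow from the three PTM means. The short Python sketch below reproduces the arithmetic; the function name is an assumption, and the concentrations are the values listed in this section.

```python
def percent_difference(value, reference):
    """Signed percent difference of `value` relative to `reference`."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (value - reference) / reference * 100.0


# PTM mean urinary BPA concentrations (ug/l) reported in this section.
CONSENSUS = 4.77
INSPQ = 4.98
CDC = 5.81

inspq_vs_consensus = percent_difference(INSPQ, CONSENSUS)  # about +4%
cdc_vs_consensus = percent_difference(CDC, CONSENSUS)      # about +22%
inspq_vs_cdc = percent_difference(INSPQ, CDC)              # about -14%
```

The size of the inter-laboratory difference depends on which laboratory is taken as the reference: relative to INSPQ, the CDC mean is about 17% higher, whereas relative to CDC, the INSPQ mean is about 14% lower.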

Data Reporting

For urinary BPA, 151 respondents for NHANES and 507 respondents for CHMS had values <LOD (6.1% and 9.3%, respectively). The difference in the method of reporting values <LOD did not substantially affect urinary BPA concentrations (e.g., for NHANES, considering all participants, there was no difference in the geometric mean value for urinary BPA using LOD/2 or LOD/√2 (2.1 ng/ml, CIs: 1.9, 2.3)).

Both the US and Canada set the LOD at 3 × SD and limit of quantitation (LOQ) at 10 × SD of replicate analyses (5 or 10) of samples with concentrations near the LOQ. Measurements with concentrations between the LOD and the LOQ are reported as above the LOD in the final data sets in both surveys. Both laboratories performed reagent blank checks, but only INSPQ found blank results slightly above the LOD, which had to be subtracted from reported data. An assessment of whether this adjustment could introduce bias found a minimal impact on overall results and no negative concentrations after blank subtraction, indicating no over-correction that could produce lower results.

Discussion

Biomonitoring data for several chemicals are now available for two neighboring countries — the US and Canada — providing researchers with the opportunity to compare chemical concentrations and develop hypotheses regarding exposures. Before these types of comparative assessments are conducted, researchers should be aware that the CHMS and NHANES are not identical; similarities and differences in the surveys should be assessed before performing comparisons. We used BPA as a case study to explore factors that might bias comparisons between surveys. Specifically, we focused on urine collection methods, population characteristics, analytical procedures, and data reporting methods. We discuss each of these here.

Urine Collection and Population Characteristics

Urine collection. One difference that could impact comparability of results is the use of field blanks by CHMS but not by NHANES. BPA in the body is rapidly metabolized by the liver to conjugated BPA and is excreted as essentially completely conjugated BPA. To analyze urinary BPA, the BPA conjugates are converted back to free, or parent, BPA via digestion with enzymes. Thus, laboratory measurements are of free BPA and cannot distinguish between environmental BPA, which is in the free form, and physiologic BPA, also in the free form after digestion of the urine. Without field blanks, it is not possible to determine whether sample contamination from environmental BPA sources or collection and storage devices affects urine sample measurements.19 In the CHMS, field blanks were used to quantify contamination; however, no comparable field blank data are available for the NHANES program. One possible source of contamination of urine samples is dust; BPA was detected, for example, in 95% of dust samples in the United States, albeit at very low levels (<0.5 to 10,200 ng/g; mean 843 ng/g; median 422 ng/g).20 To date, there is no evidence that environmental levels are high enough to substantially impact overall levels in urine, although this could conceivably be a source of bias in studies of low levels of parent BPA in serum.19

Population characteristics. Certain aspects of the sample populations in the CHMS and NHANES could impact comparison of biomonitoring results. First, the upper age cutoff differs for the two surveys. NHANES includes individuals ages 6 years and older, while CHMS has an upper age cutoff of 79 years of age. For urinary BPA measurements, this is unlikely to hinder inter-survey comparisons; limiting the NHANES population to ages 6 to 79 years resulted in the removal from the database of only 134 individuals and did not change the geometric mean urinary BPA concentration for the overall population. However, for other chemicals, attention should be paid to the effect of different age-based exclusion criteria. The effect of exclusion should be tested on a case-by-case basis and is likely especially important for chemicals with age-dependent biomonitoring results, such as lipophilic, bioaccumulative compounds.21 Second, body weights for the Canadian population are lower than those for the United States, which for equivalent urinary BPA levels would produce higher intake estimates. Third, researchers need to recognize differences in race and ethnicity between the populations in the US and Canada. The Canadian population has a far smaller proportion of blacks than the US population. Given that urinary BPA levels in the United States were statistically significantly higher in blacks than in whites, this inter-country difference in population make-up raises potentially important questions for urinary BPA comparisons, as well as comparisons of other biomonitored chemicals, between countries. For example, how does the difference in racial make-up of the two populations affect distributions of concentrations in the overall populations? In addition, is the observed difference in urinary BPA concentrations between races due to factors related to diet and/or other lifestyle variables, or is it due to differences in the way whites and blacks metabolize certain chemicals? Racial variations in enzymes that catalyze the sulfate conjugation of drugs, other xenobiotics, neurotransmitters, and hormones have been observed.22 Given the difference in racial/ethnic make-up of the CHMS and NHANES populations, exploration of this topic is warranted.

Fasting times. As a result of the short physiologic half-life of BPA and the assumption that the preponderance of BPA exposure is via diet, the duration of fasting before urine sample collection could have a substantial impact on measured urinary BPA levels. Systematic differences in fasting times between countries could therefore hinder the ability to directly compare urinary BPA data from the CHMS and NHANES. We examined this by comparing NHANES and CHMS fasting times to log urinary BPA concentrations and found essentially no correlation. A lack of correlation between fasting time and urinary BPA was also reported by Braun et al.23 Several factors could contribute to this result. First, non-dietary sources may contribute to BPA exposure. Rudel et al.24 found that dietary changes to reduce exposure to food packaging resulted in a 66% decrease in urinary BPA levels, suggesting that diet contributes to some, but possibly not all BPA exposure. Alternatively, it is possible that incorrect participant reporting of fasting times could contribute to the observed lack of correlation between fasting time and urinary BPA concentration. To address this, improved data on fasting adherence and reporting in future studies are needed. The impact of variations in — and adherence to — fasting time in NHANES and CHMS should be examined on a chemical-by-chemical basis when comparing biomonitoring data across countries. Finally, there are most certainly differences in participant exposures to BPA in terms of both timing and amount, which hinder our ability to evaluate the effect of fasting on urinary BPA concentrations (e.g., a large intake coupled with a long fasting time might yield a similar urinary BPA level as a small intake coupled with a short fasting time); the importance of this issue cannot be quantified without additional research.

Analytical Methods

Results from the proficiency testing revealed that CDC's results were higher, on average, than those for INSPQ (the INSPQ mean was approximately 14% lower than the CDC mean). It is important to note that this difference is based on a small number (N=7) of samples and that individual data for the samples are not available to further assess consistency. However, the proficiency testing process raised an interesting question related to the efficiency of the enzymes used to deconjugate the conjugated BPA in urine. CDC and INSPQ utilize different enzymes for deconjugating BPA conjugates; the enzyme used by CDC may be more efficient at breaking the sulfonate conjugate. Preliminary investigations conducted by INSPQ using unspiked urine samples to compare enzyme efficiency revealed that CDC's enzymatic conditions produce concentrations that are 10% higher than INSPQ's concentrations (unpublished data), suggesting that enzyme efficiency is the major contributor to the analytical bias.

In general, inter-laboratory proficiency testing is an important step that should be conducted for any chemical for which inter-survey comparisons are attempted. For BPA, results produced by CDC show a slight positive bias, on average, compared with those produced by INSPQ. However, recognizing the limitations of the available inter-laboratory proficiency testing data, analytical method bias does not appear to fully explain the difference in urinary BPA concentrations between the American and Canadian populations.

Data Reporting

Two data reporting issues were identified that should be considered when comparing CHMS and NHANES biomonitoring data. The first relates to differences in limits of detection and the method for reporting measures below the LOD. For urinary BPA, although the LODs and the method for reporting measures below the LOD differ, there were too few non-detects for this to substantially impact overall urinary BPA levels. For chemicals with high frequencies of non-detects (e.g., dioxins), consideration of this difference in data reporting will be more important.25

The second issue concerns temporal variability in chemical concentration and frequency of reporting. There are four timeframes in the United States for which there are adult population data from NHANES on urinary BPA (Figure 1). Examination of the temporal trend for urinary BPA in the United States reveals variability that is not likely due to overall changes in exposure, as there is no evidence of major changes in use of BPA over this time period,4 nor is the variability due to changes in analytical methodology (Calafat AM, personal communication). Research on temporal variability in individual urinary BPA levels7 suggests the possibility that the short physiologic half-life of BPA coupled with variations in day-to-day individual exposure could be responsible for some of the variability observed in Figure 1. Additional years' worth of data may be required to observe actual trends in urinary BPA levels. The temporal variability further suggests that comparing urinary BPA levels across countries without taking into account year-to-year variability is inappropriate. In fact, the geometric mean concentration for the CHMS 2007–2009 urinary BPA data (1.1 ng/ml, 95% CIs: 1.0, 1.2) is comparable to the geometric mean concentration for the NHANES data from 1988 to 1994 (1.3 ng/ml).26 For chemicals with long half-lives, less extensive temporal data reporting may be required for inter-survey comparisons.

Figure 1 Adult urinary bisphenol A (BPA) levels (μg/l, geometric mean and 95% CIs) in the US and Canada. Data for 1988–1994 are from ref. 26 (no confidence intervals were given; ages 18 and older). For the remaining years, data are from the online Centers for Disease Control and Prevention data tables for urinary BPA in adults 20 years and older (limit of detection (LOD)/√2) (http://www.cdc.gov/exposurereport/data_tables/URXBPH_DataTables.html) and from Canadian Health Measures Survey data for adults 20 years and older (truncated at age 79 years and using LOD/2).

Two relationships have remained relatively consistent in the US and Canada: the overall relationships between gender and age and urinary BPA. In all data sets except one, males had higher urinary BPA levels than females (in NHANES 2007–2008, median levels were the same for males and females) (Figure 2). In general, younger people have higher urinary BPA levels than older people, although the relationship between the 6- to 11-year age group and the 12- to 19-year age group has fluctuated depending on the timeframe examined (Figure 2).3, 4

Figure 2 Urinary bisphenol A (BPA) estimates (ng/ml, median and 95% confidence intervals) in the US (National Health and Nutrition Examination Survey 2003–2004, 2005–2006, and 2007–2008) and Canada (Canadian Health Measures Survey 2007–2009) for men and women and for several age groups over four timeframes (using limit of detection (LOD)/2 for measures below the LOD).

Conclusions

We identified several dissimilar methodologic aspects of the NHANES and CHMS. The differences assessed in this study appear to have minimal impact on the interpretation of comparative urinary spot BPA measures and BPA intake from the two surveys. An earlier review of methodologic differences in measurements of dioxins in breast milk highlighted the importance of evaluating study design before comparing data sets from different research groups.27 We recommend that before developing hypotheses regarding comparisons of biomonitoring data between surveys from different countries, a thorough review of methodologic aspects that might impact biomonitoring results be conducted.

We further recognize the ongoing controversy regarding the interpretation of studies of toxicity of BPA but note that health-based guidelines are available with which to compare population-based BPA intakes estimated from the CHMS and NHANES data. A TDI of 50 μg/kg-day (50,000 ng/kg-day) has been established by the European Food Safety Authority28; the same value is used by the US Environmental Protection Agency as an RfD.29 A provisional TDI of 25 μg/kg bw-day (25,000 ng/kg-day) was established by Health Canada.30 Based on the intakes estimated for US and Canadian populations (Tables 4 and 5), regardless of age or gender, all intakes are well below the health-based guidance values set by the US, Canada, and the European Food Safety Authority. For example, the 90th percentile intakes for 12- to 19-year olds, who had the highest intakes for any age or gender breakdown in both countries, were more than two orders of magnitude below the TDI for Canada and the RfD for the US.
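The comparison with guidance values involves a unit conversion (μg to ng) and a ratio. The Python sketch below illustrates it under stated assumptions: the guidance values are those quoted above, but the intake of 100 ng/kg-day is a hypothetical figure chosen for illustration and is not taken from Tables 4 and 5.

```python
import math

NG_PER_UG = 1000.0

# Guidance values quoted in the text, in ug/kg-day.
GUIDANCE_UG_PER_KG_DAY = {
    "EFSA TDI": 50.0,
    "US EPA RfD": 50.0,
    "Health Canada provisional TDI": 25.0,
}


def orders_of_magnitude_below(intake_ng_per_kg_day, guidance_ug_per_kg_day):
    """log10 of the ratio of a guidance value to an intake, in matching units."""
    if intake_ng_per_kg_day <= 0 or guidance_ug_per_kg_day <= 0:
        raise ValueError("intake and guidance value must be positive")
    guidance_ng_per_kg_day = guidance_ug_per_kg_day * NG_PER_UG
    return math.log10(guidance_ng_per_kg_day / intake_ng_per_kg_day)


# Hypothetical intake, for illustration only (not a value from the tables).
hypothetical_intake = 100.0
margins = {
    name: orders_of_magnitude_below(hypothetical_intake, value)
    for name, value in GUIDANCE_UG_PER_KG_DAY.items()
}
```

A result above 2 corresponds to an intake more than two orders of magnitude below the guidance value.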