Introduction

Over 10,000 facilities across the USA emit from 1 to 10,000 lb of lead particulates annually [1], contaminating local air and soil. The current blood lead threshold for toxicity is 10 μg/dL; however, a growing number of studies suggest that low-level environmental lead exposure yielding child blood lead levels consistently below 10 μg/dL is associated with neurocognitive and motor deficits [2]. Children living in lower socio-economic conditions are at substantially higher risk of lead exposure [3], suggesting a major health disparity with life-long implications [4].

Understanding neurotoxic risk in cases of chronic low-level lead exposure requires wide-scale testing and monitoring. Convenient, cost-effective devices that can provide rapid, valid, and reliable low-level blood lead quantitation are needed. The portable LeadCare® System (LCS) anodic stripping voltammetry-based point-of-care device provides one option (ESA, Inc., Chelmsford, MA, USA). With a lower detection limit of 1.4 μg/dL, this convenient low-cost device offers an attractive possibility for studies of chronic low-level lead exposure.

In clinical field trials conducted by ESA Biosciences (product insert information), the correlation of the LCS device and graphite furnace atomic absorption spectroscopy (GFAAS) reference method was 0.979 overall.

Results from independent studies comparing LCS to GFAAS have varied. In a study of 208 paired blood samples obtained from lead-exposed industry employees [5], LCS values indicated a positive bias. For values ≤10 μg/dL (16/208) the mean difference was 3.41 μg/dL (±7.61); for values >10 μg/dL, the difference was substantially less (mean difference = 0.57, SD ±5.35). When method differences were plotted against GFAAS mean values, 95% of differences fell within ±11 μg/dL; linear regression yielded r² = 0.67. Another study of 76 paired samples comparing LCS to GFAAS [6] reported r = 0.95 (r² = 0.90) overall for blood lead levels between 1.3 and 55 μg/dL; for blood lead values between 0 and 15 μg/dL (n = 31) the correlation was substantially lower, r = 0.697 (r² = 0.486). In addition, overall the 76 blood lead levels determined by LCS were significantly lower than those obtained via GFAAS (t = −7.73, p < 0.001). These results were noted to be consistent with an earlier study that reported overall lower mean blood lead levels for LCS as compared with GFAAS [7]. In another study comparing LCS to GFAAS in pregnant women and neonates [8], 95% of whom had blood lead levels below 10 μg/dL, no correlation between methods was found.

Thus far, comparison studies have used GFAAS as the “gold-standard” method. Inductively coupled plasma mass spectrometry (ICP-MS), a relatively newer detection method, may offer several advantages over GFAAS. ICP-MS has greater dynamic range, and its sample throughput is approximately twice that of GFAAS. Additionally, matrix interference is reduced in ICP-MS, reproducibility is dramatically improved, and less sample is required for analyses.

Moreover, most past comparison studies relied on correlation and group comparison methods. The limitations of these statistical approaches for examining the comparability of methods have been described [9]. For example, a comparison of means (e.g., paired t test or Wilcoxon non-parametric test) may suggest that the method means do not differ significantly; however, similar mean values can result from pairs of measurements that differ broadly. For this reason, a non-significant difference between method means does not provide a measure of method comparability. A correlation coefficient is also problematic for comparison of methods. A correlation coefficient indicates only the strength of the linear association between two sets of numbers, not the amount of difference between pairs of values. Correlation cannot reveal bias, and the magnitude of the correlation coefficient (r) is strongly influenced by the range of values under consideration; that is, smaller ranges tend to produce smaller r values. Moreover, the significance test for correlation is based on the hypothesis that the population r is zero; this hypothesis is irrelevant to the question of whether methods have sufficient agreement [9].
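The two limitations of the correlation coefficient noted above can be illustrated with a short sketch. The readings are hypothetical and were chosen only for illustration; they are not study data. In the first case one method reads a constant 2 μg/dL higher than the other, yet r equals 1; in the second, the same alternating ±1 μg/dL disagreement gives a much smaller r when only a narrow range of values is considered.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical readings in ug/dL (illustrative only, not study data).
method_a = [float(v) for v in range(1, 21)]

# Case 1: method B always reads 2 ug/dL higher than method A.
method_b_biased = [a + 2.0 for a in method_a]
r_biased = pearson_r(method_a, method_b_biased)
mean_bias = sum(b - a for a, b in zip(method_a, method_b_biased)) / len(method_a)

# Case 2: method B differs from method A by +1 or -1 ug/dL, alternating.
method_b_noisy = [a + (1.0 if i % 2 == 0 else -1.0) for i, a in enumerate(method_a)]
r_wide = pearson_r(method_a, method_b_noisy)            # values spanning 1-20
r_narrow = pearson_r(method_a[:4], method_b_noisy[:4])  # values spanning 1-4

print(f"constant bias of {mean_bias:.1f} ug/dL, r = {r_biased:.3f}")
print(f"same disagreement, wide range:   r = {r_wide:.3f}")
print(f"same disagreement, narrow range: r = {r_narrow:.3f}")
```

In both cases the pairwise disagreement is what matters for method comparison, and r does not report it.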

The Bland–Altman approach is a recommended alternative to regression and group comparison methods [10]. The Bland–Altman approach plots the difference between method values (y-axis) against the average of method values (x-axis), to provide an assessment of the level of agreement between two methods. (The definition of a suitable level of agreement varies between clinical applications.) Instrument bias is readily detected. The accuracy and precision of the LCS system as compared to ICP-MS for the detection of child blood lead levels below 10 μg/dL have not yet been examined using this approach. Here, we apply the Bland–Altman approach to examine whether the LCS and ICP-MS agree sufficiently closely for the purposes of detecting lead exposure below 10 μg/dL in children.
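As a minimal sketch of the computation behind a Bland–Altman plot, the function below returns the plotted points (average, difference), the mean difference (bias), and the conventional limits of agreement taken as the mean difference ± 1.96 standard deviations. The function name, the dictionary keys, and the four paired readings are hypothetical and serve only to show the arithmetic.

```python
import statistics


def bland_altman(method_1, method_2):
    """Return Bland-Altman points, bias, SD of differences, and limits of agreement."""
    diffs = [a - b for a, b in zip(method_1, method_2)]         # y-axis values
    averages = [(a + b) / 2 for a, b in zip(method_1, method_2)]  # x-axis values
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return {
        "points": list(zip(averages, diffs)),
        "bias": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Hypothetical paired readings in ug/dL (not study data).
lcs_values = [2.0, 3.0, 4.0, 5.0]
icpms_values = [2.5, 3.5, 4.0, 6.0]

result = bland_altman(lcs_values, icpms_values)
print(f"bias = {result['bias']:.3f} ug/dL")
print(f"limits of agreement = {result['lower']:.3f} to {result['upper']:.3f} ug/dL")
```

A negative bias here means the first method reads lower than the second on average; whether the limits are acceptable depends on the clinical application.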

Materials and Methods

Blood Collection Procedures

Permission to conduct these studies was obtained from the El Paso Independent School District Research Board, and this project was approved by the Institutional Review Board of the University of Texas at El Paso. The sample was drawn from one elementary school located in downtown El Paso, Texas. Participation was on a volunteer basis. The children tested in this sample represented 34% of the total number of enrolled children. Parents provided informed consent prior to study initiation, and all children provided informed assent prior to blood draws.

Testers cleaned their hands with antiseptic foam soap and wore personal protective equipment during blood collection. At the start of the procedure, each child washed their hands with antiseptic foam soap, and the fingers of the left hand were wiped clean with chelating towelettes specially formulated for industry use to remove lead, as well as nickel, silver, cadmium, and arsenic, from the skin surface (D-Wipe™, Esca-Tech, Inc., Milwaukee, WI, USA). Saf-T-Pro™ 1.8 mm lancets were used to prick the fourth finger of the left hand. Blood collected from the same draw was used to analyze the level of circulating blood lead by both the LCS and ICP-MS.

For LCS device analysis, blood was collected into a pre-measured 50 μL glass capillary, plunged into an ID-labeled reagent vial, mixed by tipping the vial ten times, and tested within 3 h of initial collection. For ICP-MS, an additional 100 μL of blood was collected into a micro-vial and refrigerated until transfer to the ICP-MS laboratory, typically within 72 h.

LeadCare® System Analysis

Instrumentation

The LeadCare® System device is a hand-held instrument with an operational range of 0.0–65.0 μg/dL and a lower detection limit of 1.4 μg/dL. (The LeadCare II®, released to market in 2006, has a lower detection limit of 3.2 μg/dL.) The LeadCare® System (ESA Magellan Biosciences, Chelmsford, MA, USA) product information suggests that the instrument is reliable for the detection of blood lead levels up to 65.0 μg/dL and reports that in clinical field trials the correlation of the LCS device and GFAAS reference method was 0.979 overall. The LCS testing device electrochemically measures the amount of lead in whole blood via a 1.4 × 4.2 cm colloidal gold electrode [11] onto which a drop of the mixed sample is applied. Two identical devices were used for this study. The instruments were calibrated at ESA, Inc. prior to shipment, and low and high control samples (target ranges of 7.0 ± 3.0 and 28.5 ± 4.0 μg/dL) were used for quality control at the beginning and end of each test day. Calibration buttons, included with each reagent kit, were checked to confirm that the lot code detected by the device matched the codes on the reagent containers.

Sample Treatment and Analysis

All recommended procedures for use of the LCS device were followed. For each individual sample, 50 μL of fresh-drawn blood was plunged into the proprietary reagent, which digests the red blood cells in the pre-measured matrix, thus releasing lead atoms for analysis. One drop of the mix was applied to the appropriate spot on the test strip, and a push of the start button initiated the 3-min testing period, after which the lead value was shown on the electronic display. Each value was immediately recorded on the vial and in a laboratory record book. All values were simultaneously entered into an electronic database and later checked for accuracy. For test–retest comparability, three consecutive tests of the same sample (i.e., three test strips per child) were performed. All test–retest values were recorded in the laboratory record book and in an electronic database.

ICP-MS Analysis

Instrumentation

ICP-MS analyses were performed with an Agilent 7500ce ICP/MS equipped with an octopole reaction system and a CETAC ASX-520 autosampler. Detection of ²⁰⁸Pb was performed without the use of collision gas. Samples were introduced into the plasma through a MicroMist U-series nebulizer (Glass Expansion, Australia) and a double-pass quartz spray chamber (Agilent, Santa Clara, CA, USA). Instrument parameters were: carrier gas, 0.78 L/min; makeup gas, 0.15 L/min; RF power, 1,420 W; spray chamber temperature, 2°C.

Sample Treatment and Analysis

Triplicate ICP-MS measurement requires more sample than could be drawn from one child at a time using the finger-stick method. Therefore, certified whole blood standards (Le Centre de Toxicologie du Quebec) were analyzed to determine instrument reproducibility. Specifically, ten solutions were prepared (see below) for each of two standards (4.00 and 6.59 μg/dL), and each of those was analyzed three times by ICP-MS. Standard concentrations were chosen to approximate the low-level lead values of children.

Blood standards were prepared as previously described (Agilent technical note #5988-0533EN). Briefly, 5.58 mL of water (18 MΩ DI, Labconco WaterPro® PS Station, Kansas City, MO, USA) was placed in a polypropylene tube into which 300 μL of whole blood was added, followed by addition of 60 μL of aqueous internal standard solution (100 ppb each germanium, yttrium, and terbium in 5% nitric acid, Fisher Optima) and 60 μL of aqueous 10 ppm gold in 3% hydrochloric acid (EMD Chemicals) solution. The final dilution was 20-fold, the final internal standard concentration was 1 ppb, and the final gold concentration was 100 ppb. A six-point external calibration curve was prepared from a lead stock solution in 1% nitric acid. ICP-MS standard solutions containing the elements in 2% nitric acid were obtained from Inorganic Ventures (Christiansburg, VA, USA). Samples were vortexed for a few seconds prior to a 1 min centrifugation at 2,000 rcf, and the supernatant was analyzed by ICP-MS.
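The dilution figures stated above follow directly from the pipetted volumes, as the short check below shows. The variable names are ours, and the calculation assumes that the volumes are simply additive.

```python
# Volumes from the preparation described above, in microliters.
water_ul = 5580.0
blood_ul = 300.0
internal_std_ul = 60.0
gold_ul = 60.0

# Stock concentrations from the text, in ppb (10 ppm = 10,000 ppb).
internal_std_stock_ppb = 100.0
gold_stock_ppb = 10_000.0

# Assumes the volumes are additive.
total_ul = water_ul + blood_ul + internal_std_ul + gold_ul

dilution_factor = total_ul / blood_ul
internal_std_final_ppb = internal_std_stock_ppb * internal_std_ul / total_ul
gold_final_ppb = gold_stock_ppb * gold_ul / total_ul

print(f"total volume: {total_ul / 1000:.2f} mL")
print(f"blood dilution: {dilution_factor:.0f}-fold")
print(f"internal standard: {internal_std_final_ppb:.1f} ppb")
print(f"gold: {gold_final_ppb:.0f} ppb")
```

A lead concentration measured in the diluted solution would be multiplied by the same dilution factor to recover the whole-blood value.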

Data Analysis

All data were combined into a single electronic database and checked for accuracy and outliers. MedCalc™ software was used to conduct Bland–Altman tests of repeatability within methods, and comparability across methods. The statistical methods used in this study followed accepted reporting standards for Bland–Altman analysis of method comparability [12] as summarized below.

Repeatability

The Bland–Altman approach assumes that the measurements to be compared have comparable repeatability (measurement error) given the units of measure examined. Measurement error for each instrument was assessed using Bland–Altman plots comparing differences between triplicate repeat measurements of the same sample for LCS and ICP-MS. Because triplicate repeats were used, three difference scores were calculated and then averaged. For this application, comparable repeatability was defined a priori as an average difference of ≤0.100 μg/dL (across the three runs).
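The triplicate procedure can be sketched as follows. This is one reading of the description above, under two assumptions of ours: that signed differences are averaged within each pairing (run 1 vs. 2, 1 vs. 3, and 2 vs. 3) and that the absolute value of the resulting average is compared with the 0.100 μg/dL criterion. The function name and the triplicate values are hypothetical.

```python
import statistics
from itertools import combinations

REPEATABILITY_LIMIT = 0.100  # ug/dL, the a priori criterion stated above


def triplicate_repeatability(triplicates):
    """Mean and SD of differences for each run pairing, plus their averages."""
    pair_stats = {}
    for i, j in combinations(range(3), 2):  # pairings (0,1), (0,2), (1,2)
        diffs = [sample[i] - sample[j] for sample in triplicates]
        pair_stats[(i + 1, j + 1)] = (statistics.mean(diffs), statistics.stdev(diffs))
    avg_mean_diff = statistics.mean([m for m, _ in pair_stats.values()])
    avg_sd_diff = statistics.mean([s for _, s in pair_stats.values()])
    return pair_stats, avg_mean_diff, avg_sd_diff


# Hypothetical triplicate readings in ug/dL (not study data).
example_triplicates = [
    (2.1, 2.0, 2.2),
    (3.4, 3.5, 3.3),
    (1.8, 1.9, 1.8),
    (4.0, 3.9, 4.1),
]

pair_stats, avg_mean_diff, avg_sd_diff = triplicate_repeatability(example_triplicates)
meets_criterion = abs(avg_mean_diff) <= REPEATABILITY_LIMIT

for runs, (mean_diff, sd_diff) in pair_stats.items():
    print(f"runs {runs}: mean diff = {mean_diff:+.3f}, SD = {sd_diff:.3f}")
print(f"average mean difference = {avg_mean_diff:+.3f} ug/dL, meets criterion: {meets_criterion}")
```

The averaged standard deviation is returned alongside the mean because two instruments can satisfy the mean criterion while differing considerably in spread.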

A Priori Limits of Agreement

The limits of agreement between measurement approaches were set before analyses were conducted and were based on the clinical significance of the proposed variation. The specific blood lead level thresholds above which changes in cognition and behavior can be expected to occur in a majority of children are not yet known, thus the variation limits surrounding these thresholds are also unknown. In the absence of other knowledge, the previous approach to diagnosing “toxicity” was used as a logical benchmark. Lead “toxicity” is defined in whole number values; most recently that threshold was set at 10 μg/dL [13, 14]. A clinical “worst case scenario” would be a false negative, that is, a child at risk of lead-induced neurotoxicity is not detected because of a negative bias in the instrumentation. Thus, using a 1 μg/dL interval as a differentiating threshold, in order to ensure that 95% of children (i.e., ±2 SDs) are accurately detected, the mean difference should be less than 1 μg/dL; the standard deviation of method differences should be non-significant and no greater than 0.5 μg/dL (2 SDs = ±1.0 μg/dL); and bias should be non-significant.
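The numeric part of these a priori criteria can be written as a simple check. This is a sketch under our own naming; the requirements that bias and the standard deviation be statistically non-significant are not modeled here, and the example inputs are hypothetical.

```python
MAX_ABS_MEAN_DIFF = 1.0  # ug/dL: mean difference should be less than 1
MAX_SD_DIFF = 0.5        # ug/dL: so that 2 SDs span no more than +/-1.0


def meets_a_priori_limits(mean_diff, sd_diff):
    """True when both numeric a priori limits of agreement are satisfied."""
    mean_ok = abs(mean_diff) < MAX_ABS_MEAN_DIFF
    sd_ok = sd_diff <= MAX_SD_DIFF
    return mean_ok and sd_ok


# Hypothetical inputs (not study results).
print(meets_a_priori_limits(mean_diff=0.3, sd_diff=0.4))   # both limits met
print(meets_a_priori_limits(mean_diff=0.3, sd_diff=0.8))   # spread too large
print(meets_a_priori_limits(mean_diff=-1.2, sd_diff=0.2))  # bias too large
```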

Plot of Difference Values

The x-axis values against which differences were plotted equaled the average value of the two methods. The relationship between measurement error and average difference values was determined, and confidence intervals were calculated [15].
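For readers unfamiliar with how such confidence intervals are obtained, the sketch below uses the approximations commonly given for Bland–Altman analysis: a standard error of s/√n for the mean difference and approximately √(3s²/n) for each limit of agreement. It is not necessarily the exact procedure of reference [15] or of the MedCalc software; in particular, a normal quantile is used here for simplicity where a t quantile would often be preferred. The function name and the example differences are hypothetical.

```python
import math
import statistics


def bland_altman_intervals(diffs, confidence=0.95):
    """Approximate CIs for the bias and the limits of agreement.

    Each entry of the result is (estimate, ci_low, ci_high).
    """
    n = len(diffs)
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Normal approximation (assumption); a t quantile is often used instead.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)

    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    se_bias = sd / math.sqrt(n)
    se_limit = math.sqrt(3 * sd ** 2 / n)

    return {
        "bias": (bias, bias - z * se_bias, bias + z * se_bias),
        "lower": (lower, lower - z * se_limit, lower + z * se_limit),
        "upper": (upper, upper - z * se_limit, upper + z * se_limit),
    }


# Hypothetical between-method differences in ug/dL (not study data).
example_diffs = [-0.5, -0.2, 0.1, -0.8, -0.4, 0.0, -0.6, -0.3]

intervals = bland_altman_intervals(example_diffs)
for name, (estimate, low, high) in intervals.items():
    print(f"{name}: {estimate:+.3f} (CI {low:+.3f} to {high:+.3f})")
```

Under these approximations the interval around each limit of agreement is wider than the interval around the bias by a factor of √3.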

Results

Blood lead levels were measured in 196 children (91 males, 105 females; mean age = 8.25 ± 1.86 years, range 5.2–12.8 years). Demographic characteristics of the sample are provided in Table 1. The LCS returned blood lead levels below its reported lower limit of detection (1.4 μg/dL) for 24 children (12%), and these cases were excluded from comparability analyses; 172 cases were analyzed for comparability (77 males, 95 females; mean age = 8.20 ± 1.87 years, range 5.2–12.8 years).

Table 1 Clinical and demographic characteristics of 196 children

The distribution properties of blood lead levels obtained, including and excluding cases with LCS values below 1.4 μg/dL, are shown in Tables 2 and 3, respectively. Blood lead levels determined by ICP-MS were higher. Both methods yielded distributions that were positively skewed; significant kurtosis (peaking) was also apparent for both detection systems. The exaggerated skew and kurtosis noted for the LCS values resulted from truncating the lowest values.

Table 2 Descriptive statistics for current blood lead level estimated by LeadCare® System and ICP-MS, N = 196 children, ages 5.2–12.8
Table 3 Current blood lead level estimated by LeadCare® System and ICP-MS, N = 172 children, ages 5.2–12.8 (24 excluded for values below LOD)

Repeatability

Triplicate repeat analyses were examined for each device by plotting the differences (returned by each pair of test runs) against the mean. For the LCS, 107 triplicate repeats were completed; 95 of these yielded blood lead levels above 1.4 μg/dL. LCS repeatability for test runs 1 vs. 2, 1 vs. 3, and 2 vs. 3 is illustrated in Figs. 1, 2, and 3, respectively. The mean of differences for three runs was 0.066 μg/dL (CI = −0.064 to 0.111). The average standard deviation of differences was 0.429 μg/dL.

Fig. 1 LeadCare® system repeatability: Bland–Altman plot of differences for tests 1 and 2

Fig. 2 LeadCare® system repeatability: Bland–Altman plot of differences for tests 1 and 3

Fig. 3 LeadCare® system repeatability: Bland–Altman plot of differences for tests 2 and 3

ICP-MS measurement repeatability tests are illustrated in Figs. 4, 5, and 6. For ICP-MS, the average of the mean differences was 0.092 μg/dL (CI = 0.056–0.106). The average standard deviation of differences for ICP-MS repeated tests was 0.0781 μg/dL. Thus, the mean differences achieved were comparable with regard to the a priori criterion (≤0.100 μg/dL). It should be noted, however, that the average standard deviation of differences for the LCS was 5.5 times greater than that for the ICP-MS device (as reflected in the confidence intervals).

Fig. 4 ICP-MS repeatability: Bland–Altman plot of differences for tests 1 and 2

Fig. 5 ICP-MS repeatability: Bland–Altman plot of differences for tests 1 and 3

Fig. 6 ICP-MS repeatability: Bland–Altman plot of differences for tests 2 and 3

Device Comparability

Figure 7 illustrates the differences between methods and the standard deviation of those differences. On average, the LCS returned values that were 0.457 μg/dL lower than values obtained with ICP-MS (CI = −0.610 to −0.305). The lower limit of differences was −2.441 μg/dL (CI = −2.702 to −2.180), and the upper limit of differences was 1.526 μg/dL (CI = 1.265–1.787). The standard deviation of differences was 1.012 μg/dL. On a percentage basis, LCS values were 13% lower than those from ICP-MS, with a standard deviation of percentage differences of 32.077%.
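The reported mean difference and limits can be set against the a priori window described in the Methods (mean difference under 1 μg/dL; 2 SDs within ±1.0 μg/dL) with a few lines of arithmetic. The variable names and the framing of the comparison are ours; the input values are those reported above.

```python
# Values reported in the Device Comparability paragraph, in ug/dL.
bias = -0.457
lower_limit = -2.441
upper_limit = 1.526

# A priori window from the Methods, in ug/dL.
MAX_ABS_BIAS = 1.0
MAX_HALF_WIDTH = 1.0

half_width = (upper_limit - lower_limit) / 2  # spread of differences around the bias
midpoint = (upper_limit + lower_limit) / 2    # should match the bias up to rounding

bias_ok = abs(bias) < MAX_ABS_BIAS
spread_ok = half_width <= MAX_HALF_WIDTH

print(f"midpoint of limits = {midpoint:.4f} (reported bias {bias})")
print(f"half-width of limits = {half_width:.4f} ug/dL")
print(f"bias within window: {bias_ok}; spread within window: {spread_ok}")
```

On this reading the mean difference satisfies the criterion, while the spread of differences is roughly twice the width allowed.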

Fig. 7 Bland–Altman plot of differences between LeadCare® System and ICP-MS device

Discussion

Blood lead levels have a log-normal (positively skewed) distribution, confounding the use of parametric statistics (e.g., ANOVA, correlation, and regression). For this reason, studies of neurocognitive deficits in children with low blood lead levels frequently group children as “exposed” or “not exposed” according to an a priori threshold (e.g., 2.0, 2.5, or 3.0 μg/dL). Currently, the choice of threshold is relatively arbitrary because blood lead level thresholds indicating neurotoxic risk during childhood have not yet been established. Thus far, findings have led investigators to suggest that blood lead levels as low as 2.0 μg/dL may indicate significant neurotoxic risk [2]; other findings suggest that increments of 0.5 μg/dL may distinguish whether and how neurobehavioral function is altered [16–19]. Given the complexity of child brain development, it would not be surprising to find that different brain pathways, and thus different neurobehavioral functions, are associated with different (low) level thresholds.

The results of this Bland–Altman comparison suggest that the bias and variability of LCS blood lead levels contraindicate the use of this device for studies attempting to determine critical thresholds of neurotoxic risk in cases of low-level lead exposure. More specifically, the use of this device in research examining the neurobehavioral effects of low-level lead exposure could confound the determination of critical thresholds. At the same time, it is important to note that the difference between LCS and ICP-MS and the variability of the LCS fall within an acceptable range for clinical use. Until such time as lowest-level neurotoxic effect levels are established, this device is appropriate for the evaluation and monitoring of blood lead levels of individual children in clinical settings.

Conclusions

The reproducibility and precision of the LCS reported in this study suggest that this device is appropriate for the clinical evaluation and monitoring of blood lead levels among individual children. We also conclude, however, that the LeadCare® System is not useful for research that attempts to determine neurotoxic effect thresholds for child lead exposure yielding blood lead levels below 10 μg/dL. Replications are required to confirm these conclusions. A precise, convenient, and low-cost device is needed for research investigating the neurotoxic effects of child lead exposure yielding blood lead levels consistently below 10 μg/dL.