1 Introduction
The history of validations for tissue oxygen saturation (StO2) measurements of the brain dates back to 1991. McCormick et al. [1] first described the comparison of a Near-Infrared Spectroscopy (NIRS) monitor (INVOS® 2910, Somanetics Corp., acquired by Medtronic, Dublin, Ireland) to a mixed bed of arterial, venous, and capillary blood in the brain, using a weighted blood reference consisting of both arterial and venous blood. Pollard et al. [2] validated the first US FDA cleared commercial NIRS cerebral oximeter (INVOS® 3100, Somanetics Inc., acquired by Medtronic, Dublin, Ireland) with a weighted blood co-oximetry reference of 0.75 × jugular bulb oxygen saturation (SjbO2) and 0.25 × arterial oxygen saturation (SaO2) [3, 4]. Henson et al. [5] and Shah et al. [6] followed with similar comparison studies with the INVOS 3100 monitor. Several years later, the first-generation FORE-SIGHT cerebral oximeter was validated against a cerebral blood weighted reference of 0.70 × SjbO2 and 0.30 × SaO2 [7‐10], which was supported by PET studies by Ito et al. [11]. Other NIRS cerebral oximeter validations on both adult and pediatric subjects adopted the weighted 70:30 SjbO2:SaO2 reference [12‐14].
For tissue oxygen saturation (StO2) measurements of somatic (non-cerebral) locations, the reference used for NIRS monitor StO2 comparative and validation studies has been more varied and has included in-vitro comparisons. Research NIRS devices monitoring skeletal muscle were compared to a local venous oxygen saturation value from a blood draw during exercise [15‐17]. The Hutchinson InSpectra™ (Hutchinson Technology Inc., Hutchinson, MN USA) was validated by comparing sensor measurements to blood saturation values in an in-vitro setup [18]. The ViOptix ODISsey™ (ViOptix, Inc, Fremont, CA USA) and INVOS 3100 monitors were compared to co-oximetry measurements of blood draws on isolated animal limbs [19, 20]. For pediatrics, NIRS human somatic measurements were compared to central venous blood saturation values [21, 22]. Later, FORE-SIGHT pediatric somatic StO2 values were validated by comparing to a weighted blood co-oximetry reference of 0.70 × central venous oxygen saturation (ScvO2) and 0.30 × arterial oxygen saturation (SaO2) [23], which was supported by Pang et al. [24], who estimated the whole body venous volume ratio.
The purpose of this paper is to describe one methodology of validating NIRS-based tissue oximeters accepted by the US FDA for adult clinical clearance. Other regulatory frameworks, such as the European Union Medical Device Directive (93/42/EEC) [25], have similar requirements for clinical clearance of medical devices. This methodology of validating NIRS-based tissue oximeters was used to obtain clinical clearance in the European Union, Canada, Australia, China, Japan, and Russia. Although industry methods of validation and FDA requirements have generally converged in the last two decades, there is no universally accepted reference to compare tissue oximeters against. The US FDA currently prefers that oximeters, whether pulse oximeters or tissue oximeters, be validated by comparison to a blood reference. The US FDA 510(k) medical device clearance method requires a reference to one or more predicate devices of similar function that were validated similarly to the new medical device being evaluated. We present the methodology behind the validation of the second-generation FORE-SIGHT® tissue oximeter (FORE-SIGHT ELITE®, CAS Medical Systems, Branford, CT USA) for both cerebral and somatic tissue oxygen saturation (StO2) monitoring, with the rationale behind the assumptions made, selection of a comparative reference, statistical methods used, subject recruitment requirements, particularly in terms of diverse skin tones, and regulatory requirements for clinical use. This NIRS validation methodology evolved from a history of NIRS-based tissue oximeter validation publications and FDA correspondence recommending use of Deming regression and bootstrap resampling techniques for analysis of comparative data to a reference. We will demonstrate how Deming regression and bootstrapping techniques are used to validate NIRS-based tissue oximeters, and the potential advantages. Bootstrapping validation allows pooling of all subject data into a best fit model used to set algorithm parameters, followed by model validation. Previous NIRS validations relied on methods that split the subjects into two groups, a calibration set and a test set, and/or used Bland–Altman analysis in various forms.
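As a rough illustration of the two statistical tools named above, the following sketch fits a Deming regression of StO2 against a blood reference and bootstraps the slope. It is a minimal example under stated assumptions, not the analysis used in the study: the function names, the error variance ratio `delta = 1.0`, the choice to resample whole subjects, and the percentile interval are all illustrative choices.

```python
import math
import random


def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x, returning (slope, intercept).

    delta is the assumed ratio of the error variance in y to the error
    variance in x; delta = 1.0 (an assumption here) gives orthogonal regression.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return slope, my - slope * mx


def bootstrap_slope_interval(pairs_by_subject, n_boot=1000, seed=0):
    """Percentile 95% interval for the Deming slope.

    pairs_by_subject maps a subject id to a list of (reference, sto2) pairs.
    Whole subjects are resampled with replacement (an illustrative choice),
    so repeated measurements from one subject stay together.
    """
    rng = random.Random(seed)
    subjects = list(pairs_by_subject)
    slopes = []
    for _ in range(n_boot):
        xs, ys = [], []
        for s in rng.choices(subjects, k=len(subjects)):
            for ref, sto2 in pairs_by_subject[s]:
                xs.append(ref)
                ys.append(sto2)
        slopes.append(deming_fit(xs, ys)[0])
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot) - 1]
```

Unlike ordinary least squares, Deming regression allows for measurement error in both the reference and the monitor, which is why it suits method comparison data of this kind.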
4 Discussion
The validation methodology of tissue oximeters to invasive blood reference values assumes a fixed venous to arterial (V:A) blood volume ratio that can be applied to all subjects. The V:A blood volume ratio likely varies, with different analyses suggesting cerebral V:A blood volume ratios ranging from 54:46 to 84:16 [7, 82]. Because 70:30 is near the midpoint of the estimated V:A range [7] and imaging techniques also suggest the mean cerebral V:A blood volume ratio is approximately 70:30 among different subjects in steady state healthy conditions [11], we believe that a V:A ratio of 70:30 is a reasonable assumption for the brain. Our data indicate that if the actual V:A ratio varied from 60:40 to 80:20 between subjects, the bias of StO2 versus REF CXB would change by ±3.0% compared to the selected V:A ratio of 70:30. The high precision of the FORE-SIGHT ELITE (3.07% 1 SD) for cerebral StO2 against the fixed 70:30 reference weighting across the StO2 50–90% saturation range therefore suggests that for healthy subjects under controlled PaCO2 conditions, the inter- and intra-subject variability of the V:A ratio is likely less than ±10%. As an indirect comparison, pulse oximetry precision for adults derived from a controlled hypoxia study is ~2% (1 SD) when compared to arterial blood oxygen saturation [83]. It is unlikely that in-vivo validated NIRS tissue oximetry systems will reach pulse oximeter precision, in part because NIRS tissue oximeters need both arterial and venous blood oxygen saturation co-oximeter measurements, which adds more variability to the REF CX reference measurement, and also because NIRS tissue oximetry interrogates deeper into tissues to make a StO2 measurement. Note that a NIRS monitor cannot measure the actual V:A blood volume ratio in tissue and does not distinguish venous and arterial contributions, a common point of confusion about NIRS monitors. The V:A ratio is only used to derive a reference from blood samples during validation of the NIRS monitor.
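The dependence of the reference on the assumed V:A ratio can be made concrete with a short sketch. The function names and the sample saturations below are hypothetical and are not study data; the 30-point arterial to venous difference is chosen only to show how a ±10 point change in the venous weighting moves the reference.

```python
def weighted_reference(s_venous, s_arterial, venous_fraction=0.70):
    """Weighted co-oximetry reference (%), e.g. 0.70 x SjbO2 + 0.30 x SaO2."""
    return venous_fraction * s_venous + (1.0 - venous_fraction) * s_arterial


def reference_shift(s_venous, s_arterial, venous_fraction, nominal=0.70):
    """Change in the reference when the true venous fraction differs from nominal."""
    return (weighted_reference(s_venous, s_arterial, venous_fraction)
            - weighted_reference(s_venous, s_arterial, nominal))


# Hypothetical paired blood sample: venous 60%, arterial 90%
sjbo2, sao2 = 60.0, 90.0
for fraction in (0.60, 0.70, 0.80):
    print(fraction,
          round(weighted_reference(sjbo2, sao2, fraction), 2),
          round(reference_shift(sjbo2, sao2, fraction), 2))
```

Algebraically, the shift equals the change in venous fraction multiplied by (venous minus arterial saturation), so it grows with the arterial to venous saturation difference of the particular sample.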
An interpretation of these data is that the inter-subject variability of cerebral vasoreactivity during controlled PaCO2 conditions is likely low within healthy adult subjects. The mean V:A ratio will then likely be less variable than in other patient populations with morbidities or during uncontrolled PaCO2 states. Therefore, validation with healthy adult subjects with controlled PaCO2 may serve as a control. Measured precision and regression parameters would then indicate how the tissue oximeter performs under near-ideal conditions. A tissue oximeter that shows more variability when compared to a reference under near-ideal conditions will likely demonstrate more variability when used as a clinical monitor. A controlled tissue oximetry validation cannot be performed for pediatric and neonatal subjects for ethical reasons, so non-healthy pediatric subjects undergoing cath-lab procedures are commonly used [13, 23, 84, 85]. As a result, precision and regression parameters from pediatric tissue oximetry validation exhibit more variability compared to a control study [13, 85]. Because general sensor and algorithm designs are usually similar across subject populations for a particular tissue oximeter model, the adult validation may indirectly serve as a reference for pediatric tissue oximetry performance as well.
It is understood that the cerebral venous to arterial blood volume ratio varies physiologically in the tissue vasculature that is interrogated by a NIRS sensor [7, 86‐88] as PaCO2 normally varies among human and other mammalian subjects. Since CO2 is a potent vasodilator of the cerebral vasculature, PaCO2 levels in blood can shift the V:A ratio: high PaCO2 levels (hypercapnia) would drive the arterial blood volume fraction above 30%, while low PaCO2 levels (hypocapnia) would drive it below 30% [89, 90]. Because hypocapnia results in vasoconstriction of cerebral arterial blood vessels and thus reduced flow, cerebral tissue ischemia can result [91‐95]. In addition to the effects of lower perfusion, a NIRS sensor would also interrogate less arterial blood volume relative to venous blood volume in the tissue. This compound effect will result in a decrease of StO2, which would alert the clinician and warrant a check of PaCO2 levels [96‐98]. Reduced minute ventilation to increase CO2 levels is often used as an intervention to increase cerebral blood flow and resultant perfusion [99‐103]. In this case, a NIRS sensor would detect an increase of arterial blood volume relative to venous blood volume as well as an increase in flow, resulting in an increase of StO2, the desired effect. Therefore, we believe that a cerebral tissue oximeter validated using a controlled fixed V:A blood volume ratio REF CXB reliably provides clinicians real time information on the effect of both adverse and beneficial changes in cerebral vasoreactivity and V:A blood volume ratio shifts.
For the somatic co-oximeter reference REF CXS, the mean V:A non-cerebral tissue blood volume ratio was also assumed to be 70:30 among different subjects in steady state healthy conditions. This assumption was based on the findings of Pang et al. [24] that the venous system of the whole body contains 70% of total blood volume. However, somatic tissue blood volume V:A ratios can vary greatly under normal and abnormal physiological conditions. For example, muscle exercise may dynamically change the V:A ratio between contraction and relaxation. Body position, such as standing upright, may result in pooling of venous blood volume in the lower extremities compared to the supine position. Therefore, for somatic validation, the subjects were in the supine position and relaxed, with negligible muscle activation resulting in resting state metabolism for the somatic sensor measurement sites. This controlled resting state appeared to effectively limit the variation in V:A blood volume ratio, as evidenced by somatic StO2 accuracy measurements within 6% (1 SD) compared to a fixed 70:30 blood volume ratio REF CXS.
The results showed that, compared to REF CXS, the somatic StO2 measurement precision and individual Deming regression slope decreased as the body location moved farther away from the heart. The Flank StO2 measurements showed the highest precision (4.45%), followed by Quad StO2 measurements (5.41%), then Calf StO2 measurements (5.91%). Because the blood in the vena cava represents the global venous blood return of the body, multiple somatic StO2 measurements are averaged to better reflect the global SvcO2 co-oximetry measurement as part of REF CXS, with a precision of 4.22% compared to the next best 4.45% of the Flank StO2 measurements alone. Due to heterogeneity in tissue oxygenation demand and metabolism, it is likely that somatic StO2 would have some variability at different body locations. An alternative validation method for limb muscle StO2 is to use blood from the venous return of the limb that is close to the muscle of interest [104], as opposed to the global vena cava venous return used in this study. Somatic StO2 measurements are best made on the larger muscles of the body, where NIRS light can diffuse and scatter unimpeded by the tissue geometry. Bony areas of the body such as ankles, wrists, and parts of the hands and feet may alter the NIRS photon path to the sensor detectors, resulting in unreliable StO2 measurements, particularly for sensors with larger light source to detector separations.
When validating tissue oximetry data to an internal blood reference, two different data analysis methods accepted by the U.S. FDA can be chosen. The first method involves splitting the subjects into two groups, a calibration set and a test set [12]. The second method involves pooling all subjects into a best fit model used to set algorithm parameters and then validating the model using statistical techniques such as bootstrapping, which was done here. To determine which validation method to use, the following considerations need to be examined. For clinical validity and generalizability, the enrolled subject group should reflect the general population in terms of demographics such as weight, gender, and skin pigmentation. In a recent FDA guidance for pulse oximetry, the FDA recommends use of a minimum of 200 paired data points from at least 10 subjects, where at least 2 subjects or 15% of subjects are darkly pigmented, whichever is larger [83]. Besides skin pigmentation, inter-subject variability of deeper tissue background optical properties can have an impact on tissue oximeter accuracy when compared to a blood reference. Such inter-subject differences have been observed to result in physiologically anomalous readings or variable agreement with invasive blood references [12]. Deep tissue optical characteristics may include the optical effects of tissue, muscle, and bone density, heterogeneous tissue pigmentation, hair follicles, and scarring from prior injuries, contusions, concussions, or facial surgeries. Furthermore, anatomical variations influence the distribution and characteristics of the various tissue contributions. Since the background deep tissue optical characteristics cannot be determined by visually examining subjects and are independent of race, an effective sample size needs to have a high probability of including a wide range of subjects with different deep tissue optical characteristics.
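The pulse oximetry enrollment recommendation quoted above (at least 200 paired points, at least 10 subjects, and the larger of 2 subjects or 15% of subjects darkly pigmented) can be written as a simple check. This is only a sketch of that arithmetic; the function names are hypothetical, rounding 15% up to a whole subject is an assumption, and the guidance itself should be consulted for how it applies to a given device.

```python
def min_darkly_pigmented(n_subjects):
    """Larger of 2 subjects or 15% of subjects, rounded up (rounding is an assumption)."""
    fifteen_percent = (15 * n_subjects + 99) // 100  # integer ceiling of 0.15 * n
    return max(2, fifteen_percent)


def meets_enrollment_minimums(n_subjects, n_darkly_pigmented, n_paired_points):
    """True when the cohort satisfies the three minimums quoted in the text."""
    return (n_subjects >= 10
            and n_paired_points >= 200
            and n_darkly_pigmented >= min_darkly_pigmented(n_subjects))


# Hypothetical cohorts, for illustration only
print(min_darkly_pigmented(10))   # 15% of 10 is 1.5, so the floor of 2 applies
print(min_darkly_pigmented(40))   # 15% of 40 is 6
print(meets_enrollment_minimums(20, 3, 250))
```

Such a check covers skin pigmentation only; as the text notes, deep tissue optical characteristics cannot be screened visually, so they have to be covered by sample size rather than by a quota.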
Two follow-up first generation FORE-SIGHT studies with comparison to the invasive reference REF CXB [105, 106] showed consistency in precision following validation using the modeling and statistical validation method with 17 subjects [71]. The validation of another tissue oximeter using the calibration and test method, splitting 23 subjects into two groups (11 calibration subjects and 12 test subjects) [12], gave an unexpected result where the test accuracy measurement was better than the calibration value, which may indicate that the test group subjects had less background tissue optical heterogeneity than the calibration group. For this reason, the approach described herein, using the full data set for the best fit modeling and advanced statistical validation techniques, was chosen for the FORE-SIGHT ELITE. By using a larger data set and accounting for sampling variability, this method may be more reliable in predicting clinical monitor performance over a wider range of subjects with different background optical characteristics. For validations that split subjects into two groups (calibration set and test set), the overall subject sample size would need to be doubled to match an effective sample size that includes a wide range of subjects with different deep tissue optical characteristics.
When comparing the accuracy of NIRS tissue oximeters to other oximetry systems, the semi-invasive optical SvO2 catheters may be the best comparison. These catheters measure SvO2 in venous blood vessels around the heart (central venous) and in the internal jugular vein/jugular bulb, part of the brain venous drainage system. SvO2 catheters measure SvO2 directly with an optical interface to blood, so light does not first pass through tissues as it does in tissue oximetry. For three SvO2 catheter oximeter systems, in-vivo comparison with co-oximetry of blood samples demonstrated a precision of 4.3–7.1% (1 SD) [107]. For the Edwards Lifesciences (Irvine, CA) Vigileo™ SvO2 catheter system, the in-vivo comparison with co-oximetry of blood samples demonstrated a precision of 4.1% (1 SD) [108, 109]. The precision of FORE-SIGHT ELITE StO2 for cerebral (3.07% 1 SD) and somatic (4.22% 1 SD) measurements is comparable to that of optical SvO2 catheter oximetry systems.
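For readers comparing the 1 SD figures quoted here, the sketch below shows one common way such numbers are computed from paired data: bias as the mean of the monitor minus reference differences, and precision as the sample standard deviation of those differences. Treating "precision (1 SD)" this way is an assumption for illustration, and the function name and sample values are hypothetical.

```python
import statistics


def bias_and_precision(monitor_values, reference_values):
    """Return (bias, precision) for paired saturation readings in %.

    bias is the mean difference (monitor minus reference); precision is the
    sample standard deviation (1 SD) of the differences.
    """
    if len(monitor_values) != len(reference_values):
        raise ValueError("monitor and reference readings must be paired")
    differences = [m - r for m, r in zip(monitor_values, reference_values)]
    return statistics.mean(differences), statistics.stdev(differences)


# Hypothetical paired readings, for illustration only
sto2 = [71.0, 72.0, 73.0]
ref = [70.0, 70.0, 70.0]
bias, precision = bias_and_precision(sto2, ref)
print(bias, precision)
```

This simple form ignores repeated measurements within a subject, which a full validation analysis would need to account for.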
An alternative method for validating NIRS tissue oximeters under development involves in-vitro tests on a liquid optical phantom [110‐114]. The liquid phantom contains a predetermined solution of saline, human blood hemoglobin, Intralipid®, sodium bicarbonate, glucose, and baker’s yeast to desaturate the hemoglobin [110, 111]. An issue that needs to be resolved is that different NIRS devices measure different StO2 values from sensors placed on the phantom, and in-vivo validated NIRS monitors produce values that differ from those independently measured on the blood inside the phantom [110, 111]. This is in part due to the different algorithms of the monitors, the sensor optical configuration, how the monitors compensate for skin pigmentation and background optical properties other than hemoglobin, and the validation methodology of the monitor. Phantoms generally absorb and scatter light differently from human tissue, as evidenced by the attenuation of light at each sensor’s light source wavelengths (personal observation). If the optical properties of phantoms and biological tissue are not well matched, a tissue oximeter StO2 algorithm may behave differently, so that the value and rate of change of StO2 compared to a phantom blood saturation reference will show a bias and a different regression slope. One improvement in phantom design may include better optical spectral matching with human tissues for light attenuating components other than hemoglobin. Skin pigmentation and deeper tissue optical characteristics, which attenuate light more at the lower wavelengths (<750 nm) [115], could be modeled in the phantoms, perhaps with a red dye. An ideal phantom would give the same quantitative value for the tissue oximeter parameter of interest (such as StO2) when measured by different manufacturer model monitors, corresponding to the same quantitative parameter value measured on human subjects. In the future, an in-vivo blood co-oximetry validated monitor “A” could be used to calibrate the ideal NIRS phantom, and this phantom could then be used to calibrate and/or test monitors “B”, “C”, etc.
Tissue oximeter validation should be standardized so that in clinical use, StO2 measurements between tissue oximetry models are more consistent. Areas of standardization may include using a fixed mean blood volume ratio based on best available information, for which we suggest a V:A blood volume ratio of 70:30; use of highly accurate co-oximeter models, especially at lower oxygen saturation values, for the reference measurements; and, for adult subjects, use of a hypoxia protocol with a good distribution of FiO2 levels while controlling PaCO2 levels to a limited range. A good distribution of skin tones from the different races is needed [83], as is a good, randomly obtained distribution of subject background optical characteristics, achieved by having an effective sample size. If a liquid or other optical phantom can model all these parameters, then an alternative NIRS validation method may be available in the future.
For direct comparisons of NIRS tissue oximeter models, caution is advised in interpreting the results when no comparative co-oximetry blood oxygen saturation reference (such as REF CX) is used as a control. One cannot determine which monitor is more accurate or has the more appropriate StO2 value or rate of change [116] during a hypoxic or ischemic event without an appropriate comparative reference. Likewise, caution is advised in interpreting comparisons of different NIRS tissue oximeter models to blood oxygen saturation references different from that of the original NIRS tissue oximeter’s validation reference, such as cerebral StO2 vs central or mixed venous SvO2 [117‐119]. Furthermore, results may not be comparable when a sensor is applied outside the manufacturer’s indications for use, such as an adult validated sensor on an infant subject [120]. Both the StO2 value and the rate of change of StO2 in response to a physiological event will likely be inaccurate, as the assumptions behind the sensor design and algorithm used will be different.
Ultimately, demonstrated clinical utility of NIRS tissue oximeters is important to gain acceptance for use in patient monitoring in healthcare systems. Relationships between StO2 and both physiological parameters and outcome variables have been discussed elsewhere [121, 122]. Low StO2 values have been associated with post-op complications in aortic surgery [123], single lung ventilation [124, 125], beach chair shoulder procedures [126], and cardiac surgery [100, 101, 127]. StO2 values provide guidance for setting ventilation controls, particularly end tidal CO2 [103], setting safe ablation and entrainment mapping periods in ventricular tachycardia treatment [128], targeting oxygen saturation ranges to reduce risk of retinopathy in neonates [129], and avoiding catastrophic events, such as by detecting misplaced cannulas and incorrect ventilation settings in surgery [130‐132]. More interventional studies are needed to see if goal directed therapy based on StO2 can improve outcomes [121]. Standardized validation of tissue oximeters allows for better cross analysis of data between different manufacturer monitor models, increasing the potential of finding clinical correlations with disease states and corresponding outcomes, and of determining possible interventions to improve outcomes.
In conclusion, we present the validation of the FORE-SIGHT ELITE tissue oximeter and the rationale behind the assumptions made in the protocol based on our experience with these monitors. We assumed that the cerebral and somatic invasive blood reference consisting of weighted tissue mean blood volume ratio (V:A) is 70:30 at PaCO2 of 37–40 mmHg based on prior publications, and that this ratio is generally constant for healthy human subjects because of the high level of precision of tissue oximeter StO2 when compared to this invasive reference. We acknowledge that the V:A blood volume ratio normally varies in physiology and believe that monitoring StO2 is clinically important in part to show how the V:A ratio changes due to CO2 or other agents affecting tissue oxygenation. We believe that use of advanced statistical techniques such as Deming regression and bootstrap resampling to validate the best fit full data set model provides a more reliable representation of clinical performance over a wider range of subjects with different skin tones and background optical characteristics for a given sample size. Finally, we suggest standardization of tissue oximetry validation, whether in-vivo as presented, and/or in-vitro with an ideal NIRS phantom when perfected, so that tissue oximeters used in the clinic make more reliable measurements, with more consistency between different manufacturer tissue oximetry models, and therefore maximize overall utility of tissue oximetry in the clinic.