Background
Gastro-oesophageal reflux disease (GORD) is a widely prevalent disorder characterized by the disruption of the oesophageal mucosa by, or perceptive sensitivity to, gastric refluxate [
1]. GORD is initially diagnosed by clinical suspicion, response to an empiric trial of proton pump inhibitor (PPI) therapy, or endoscopic evaluation of the mucosal consequences of refluxate exposure [
1]. However, ambulatory reflux monitoring is indicated to provide confirmatory evidence of GORD in patients with atypical GORD symptoms or a normal endoscopic evaluation, or before anti-reflux surgery is considered [
1]. Ambulatory 24-h pH-impedance monitoring (24-h pH-imp) is considered the consensus standard for diagnosing GORD, as it provides a quantitative measure of gastro-oesophageal reflux (GOR) events through the detection of oesophageal acid exposure and retrograde movement of liquid or gas refluxate in the oesophagus via impedance monitoring [
1,
2].
Among the pH monitoring metrics, oesophageal acid exposure time (AET) is the most reproducible and the most specific for GORD [
3,
4]. Elevated AET is not only predictive of a positive response to PPI therapy [
5,
6], but multivariate analyses have also shown that elevated AET is the most predictive metric of response to medical and surgical GORD management based on dominant symptom index and global symptom severity score outcomes [
7,
8]. Oesophageal multichannel intraluminal impedance (MII) testing provides an adjunct method for assessing GOR by measuring changes in resistance to alternating electrical currents as boluses of liquid, gas, or both pass through the oesophagus in retrograde fashion. Impedance is typically used to measure GOR frequency and is correlated statistically with patient symptom perception through the symptom association probability (SAP). When combined, 24-h pH-imp allows for the detection of acid, weakly acidic, and nonacid reflux events. However, the additional benefits of nonacid reflux detection, which cannot be gleaned from pH studies alone, and of SAP correlation are limited to hypersensitive oesophagus phenotypes [
9] or to testing performed on PPI therapy [
10].
Normal values for AET were initially determined in a study of ambulatory oesophageal pH monitoring that compared patients with the typical GORD symptoms to asymptomatic controls, generating the conventional standards of < 4.2% total AET, < 6.3% AET in the upright position, and < 1.2% AET in the recumbent position [
11]. Numerous subsequent studies have found that comparable AET cutoffs have a sensitivity of 77–100% and a specificity of 85–100% in discriminating oesophagitis from normal controls [
3,
12‐
17], and recent consensus guidelines continue to define AET < 4% as definitively normal and AET > 6% as definitively abnormal [
1,
18].
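The consensus bounds described above can be expressed as a simple three-way rule. The following is a minimal sketch, not part of the original study: the function name and the label "inconclusive" for the 4–6% range are illustrative assumptions.

```python
def classify_total_aet(total_aet_percent: float) -> str:
    """Classify total acid exposure time (AET, in %) using the consensus
    bounds quoted in the text: < 4% normal, > 6% abnormal.

    The label "inconclusive" for values between 4% and 6% is an assumption
    made for this sketch.
    """
    if not 0.0 <= total_aet_percent <= 100.0:
        raise ValueError("AET must be a percentage between 0 and 100")
    if total_aet_percent < 4.0:
        return "normal"
    if total_aet_percent > 6.0:
        return "abnormal"
    return "inconclusive"
```

Under this rule, a total AET between 4% and 6% is the situation in which an adjunctive metric would be consulted.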
Normal values for simultaneous impedance testing were initially determined from serial measurements of impedance-measured reflux frequency in 60 healthy volunteers, where the 95th percentile was defined as the threshold for the diseased state: > 73 total GOR events, > 67 GOR events in the upright position, and > 7 GOR events in the recumbent position [
19]. However, subsequent studies of reflux frequency have found more heterogeneous results [
20], and the number of reflux events alone has not been proven to be predictive of treatment outcome [
7,
8,
21]. Instead, consensus guidelines recommend GOR frequency be used as an adjunctive metric when AET alone is inconclusive [
1,
18].
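To illustrate how a percentile-based normal limit of this kind is derived, the sketch below computes the 95th percentile of reflux-event counts from a control sample. It uses the nearest-rank percentile definition, which is an assumption; the original normative study may have used a different percentile method, and the function name is hypothetical.

```python
import math
from typing import Sequence


def upper_normal_limit(control_counts: Sequence[int], percentile: float = 95.0) -> int:
    """Return the nearest-rank percentile of reflux-event counts measured in
    asymptomatic controls. Counts above this value would be called abnormal.

    Nearest-rank is one of several percentile definitions; it is used here
    only for simplicity.
    """
    if not control_counts:
        raise ValueError("at least one control measurement is required")
    if not 0.0 < percentile <= 100.0:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(control_counts)
    rank = math.ceil(percentile * len(ordered) / 100.0)  # 1-based rank
    return ordered[max(rank, 1) - 1]
```

Because a limit defined this way depends only on the distribution in healthy controls, it carries no built-in information about how well it separates symptomatic patients with and without pathologic acid exposure.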
Clinically, we noted frequent discrepancies in 24-h pH-imp studies performed on patients with symptomatic GOR, often finding abnormal AET alongside a normal number of impedance-measured reflux events. Given that AET was validated in patients with symptoms of GORD and MII was validated in healthy controls, we sought to analyze the predictive accuracy of impedance findings in relation to AET in patients with GORD symptoms off PPI therapy.
Discussion
The present retrospective review of 24-h pH-imp studies provides insight into the clinical discrepancies between the quantifiable severity of pH and impedance findings in unmedicated patients with GORD symptoms. In this sample, average total AET was 10.5 ± 9.9%, which is 2.5-fold higher than consensus AET criteria, and 63.8% of patients met pH criteria for a diagnosis of GORD (Fig.
1). However, only 22.2% of patients met MII criteria for pathologic reflux frequency (Fig.
2) despite a high median total GOR frequency of 43 (IQR 21–68) compared to validated MII studies in asymptomatic controls (Table
3) [
19,
28].
Table 3
Comparison of impedance findings between the present study in patients with gastro-oesophageal reflux symptoms and previous analyses in asymptomatic patients. The ideal cut-off points for impedance as measured by the present study fall below the conventional cut-offs as determined by the 95th percentile of asymptomatic patients [
12,
20]
A review of the historical validation of ambulatory pH and MII testing demonstrates interesting differences between the modalities. Normal values for AET were initially validated through direct comparisons between patients with and without GORD symptoms [
11], and subsequent studies found comparable AET cutoffs with reliable sensitivity and specificity for GORD [
3,
12‐
17]. Using AET as a primary metric, additional studies of pH alone or combined pH and MII monitoring have indicated that elevated AET is the most predictive marker of response to PPI therapy [
5,
6], improvement in symptom burden [
7], and post-surgical outcomes [
8]. On the other hand, MII was initially validated in asymptomatic volunteers, on and off PPI therapy, and stratified based on 95th percentiles to define pathologic GOR frequency [
19]. Furthermore, subsequent impedance studies have been more heterogeneous, without a clear cut-off for abnormal GOR frequency [
10,
28‐
30], and GOR event frequency has not been clearly shown to affect GORD treatment outcomes [
7,
8,
21].
With this in mind, we generated a novel analysis of the predictive accuracy of GOR frequency to identify AET-defined GORD using ROC curves (Table
2). In this model, current thresholds for abnormal MII GOR frequency are weakly sensitive (20.9–36.6%) and highly specific (80.6–100%) (Table
2). This weak sensitivity for reflux is congruent with the sub-diagnostic levels of impedance noted in our panel of patients with symptomatic GOR and may offer insight into the poor performance of GOR frequency in clinical outcomes studies of the diseased state. Additionally, the AUC for current MII GOR thresholds is low, suggesting that impedance GOR measures tend to underdiagnose GORD and thus may be less useful as an adjunctive metric per the Lyon Consensus guidelines [
18].
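The sensitivity and specificity of a GOR frequency cut-off against an AET-defined reference can be computed as in the sketch below. This is an illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, and the cut-off search uses Youden's J statistic (sensitivity + specificity − 1) as a stand-in selection criterion, which may differ from the AUC-based procedure used in the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    gor_counts: Sequence[int], aet_positive: Sequence[bool], cutoff: int
) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'GOR events >= cutoff',
    using AET-defined GORD as the reference standard."""
    if len(gor_counts) != len(aet_positive):
        raise ValueError("inputs must have the same length")
    tp = fn = tn = fp = 0
    for count, reference_positive in zip(gor_counts, aet_positive):
        test_positive = count >= cutoff
        if reference_positive and test_positive:
            tp += 1
        elif reference_positive:
            fn += 1
        elif test_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_by_youden(gor_counts: Sequence[int], aet_positive: Sequence[bool]) -> int:
    """Scan every observed count as a candidate cut-off and return the one
    maximizing Youden's J (the lowest such cut-off on ties)."""
    if not gor_counts:
        raise ValueError("no observations")
    best_cutoff, best_j = None, -2.0
    for cutoff in sorted(set(gor_counts)):
        sens, spec = sensitivity_specificity(gor_counts, aet_positive, cutoff)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff
```

A high cut-off in this scheme trades sensitivity for specificity, which is the pattern reported for the current thresholds.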
Using the same model, AUC analyses of serial ROC curves identified that an MII GOR frequency cut-off of ≥41 events has an optimized AUC of 0.83, with a sensitivity of 69.6% and a specificity of 80.7% for detecting pathologic levels of GOR by pH analysis (Table
3). A cut-off of 41 reflux events falls significantly below current consensus thresholds [
1,
18], but is higher than the median frequency of events seen in validated studies of healthy controls (Table
3) [
19,
28]. In this sample, lowering the GOR frequency threshold from ≥73 events to ≥41 events would increase the percentage of patients meeting GORD criteria from 22.2 to 52.7%, a number more closely approximating the incidence of elevated AET (63.8%) in the same population. A threshold of < 41 events also aligns more closely with the Lyon Consensus definition of < 40 reflux events as definitively physiologic [
18]. Reducing the threshold from ≥73 events to ≥41 events would also raise the Cohen's kappa coefficient (κ) of inter-test reliability between AET and GOR frequency from 0.25 to 0.38. In practice, improving the agreement between impedance and pH will help alleviate the clinical conundrum that arises when the independent assessment of pH and MII metrics differs in interpretation, thereby increasing the diagnostic confidence of the clinician and leading to more disease-directed therapy.
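For reference, Cohen's kappa for two binary tests can be computed from a 2×2 agreement table as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The sketch below is illustrative only: the function name and argument layout are assumptions, and no counts from the present study are reproduced.

```python
def cohen_kappa_2x2(both_pos: int, only_a_pos: int, only_b_pos: int, both_neg: int) -> float:
    """Cohen's kappa for two binary tests A and B (for example, AET and
    GOR frequency), given the four cells of their 2x2 agreement table."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    if n == 0:
        raise ValueError("empty table")
    observed = (both_pos + both_neg) / n
    a_pos = both_pos + only_a_pos  # cases positive on test A
    b_pos = both_pos + only_b_pos  # cases positive on test B
    # Chance agreement: both positive by chance plus both negative by chance.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

A value of 0 indicates agreement no better than chance and 1 indicates perfect agreement, so a rise from 0.25 to 0.38 is an improvement in agreement between the two tests that still falls well short of full concordance.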
Analysis of the accuracy of MII GOR detection is limited by the lack of a true gold standard for diagnosing GORD, given that its disease spectrum ranges from functional heartburn to Barrett’s oesophagus [
18,
31]. In the absence of such a standard, impedance was compared directly to AET because of the aforementioned reproducibility, reliability, predictive value, and prospective significance of the metric. However, a not insignificant proportion of patients with typical and atypical symptoms of GORD resulting from non-erosive reflux disease, oesophageal hypersensitivity to refluxate exposure, and functional heartburn may be missed by AET detection alone, and the benefits of quantifying and characterizing nonacid GOR events have been described [
9]. Still, it should be noted that in 24-h pH-imp studies conducted while off acid suppressive therapy for a period of ≥7 days, the majority of reflux events are acidic [
10,
32‐
35]. This finding was echoed in the present study, in which nonacid reflux occurred in 3 patients at a frequency of 1 event per individual. Therefore, in a population of patients with symptomatic unmedicated GORD, the major benefit of MII, the detection of nonacid GOR, may be negligible; thus, the use of pH-based definitions of pathological reflux may be viewed as an acceptable standard [
2,
19,
36,
37].
The Lyon Consensus also summarizes novel metrics that can be obtained from ambulatory pH-impedance testing, including the post-reflux swallow-induced peristaltic wave (PSPW) index and mean nocturnal baseline impedance (MNBI) [
18]. While the PSPW index has shown promise in augmenting the diagnostic value of ambulatory pH-impedance testing, it requires an additional cumbersome manual calculation of the ratio of impedance-measured reflux events that are followed by a PSPW. This value has been shown to correlate with oesophageal body peristaltic reserve as well as to discriminate pathologic acid exposure from non-pathologic acid exposure states (reflux hypersensitivity, functional heartburn, and control patients) [
38‐
40]. The present study was designed to compare the total number of impedance-measured reflux events generated by ambulatory reflux monitoring software (with manual verification) in high and low acid exposure states, and interpretation of the PSPW index would not be affected by changing the thresholds of normality for GOR events. Likewise, MNBI has also shown promise as a surrogate measure of the microscopic changes caused by oesophageal acid exposure that can predict relative AET and response to antireflux therapies without impact from changing GOR frequency thresholds [
39,
41‐
43]. As evaluating these novel metrics was not the focus of the present study, these parameters were not examined.
Lowering MII GOR frequency thresholds will inevitably increase the number of false positive GORD diagnoses. While this may be an acceptable risk in exchange for greater concordance between pH and impedance metrics and a simpler analytical process, one may also opt to utilize baseline impedance or post-reflux swallow-induced peristaltic wave analysis when a diagnosis of GORD is still in question [
1,
18]. Moving forward, a prospective study utilizing these proposed cut-off values in symptomatic patients will help to establish a true sensitivity and specificity for these values.