The apnea–hypopnea index (AHI) is the number of breathing pauses (apneas) and partial reductions in respiratory flow (hypopneas) lasting at least 10 s per hour of sleep. The AHI is the main criterion used to grade the severity of obstructive sleep apnea (OSA). Apneas are defined by a pause or at least 90% reduction in airflow, and hypopneas by at least 30% reduction in airflow associated with an oxygen desaturation of at least 3% and/or an arousal [1]. The American Academy of Sleep Medicine (AASM) recommends an oronasal thermal airflow sensor as the first-choice sensor for detection of apneas and a nasal pressure transducer as the first-choice sensor for detection of hypopneas [1].
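The scoring definitions above can be written as a simple decision rule. The following is a minimal sketch only, not a scoring implementation: the function name, the parameter names, and the choice to treat all thresholds as inclusive (following the "≥" convention of the AASM rules) are assumptions made for illustration.

```python
from typing import Optional


def classify_event(duration_s: float,
                   flow_reduction_pct: float,
                   desaturation_pct: float = 0.0,
                   arousal: bool = False) -> Optional[str]:
    """Label one candidate respiratory event (hypothetical helper).

    Thresholds follow the definitions in the text; treating them as
    inclusive (>=) is an assumption of this sketch.
    """
    if duration_s < 10:
        return None  # too short to be scored
    if flow_reduction_pct >= 90:
        return "apnea"
    if flow_reduction_pct >= 30 and (desaturation_pct >= 3 or arousal):
        return "hypopnea"
    return None  # flow reduction without desaturation or arousal


# Illustrative inputs (not study data)
print(classify_event(12, 95))                        # apnea
print(classify_event(12, 50, desaturation_pct=3.0))  # hypopnea
print(classify_event(12, 50))                        # None
```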

Oronasal thermal airflow sensors (Therm) use the difference between the temperature of exhaled and ambient air to estimate airflow and detect mouth breathing. The use of temperature as a surrogate for measurement of airflow is an adequate method to detect apneas, because it has the advantage of detecting both nasal and oral airflow. However, while highly sensitive, these sensors only measure airflow indirectly and do not provide quantitative measurements of airflow for detection of hypopneas.

Nasal pressure transducers (NP) are sensors capable of detecting pressure changes during inspiration and expiration. This semi-quantitative measurement of airflow pressure gives details on airflow such as the presence of flattening of the inspiratory part of the signal in case of increased upper airway (UA) resistance and also allows better evaluation of hypopneas than thermistors. The combined application of NP and Therm is necessary to detect mouth breathing, ensuring a reliable oronasal flow measurement.

In addition, flow derived from respiratory inductive plethysmography (RIP flow) from the thorax and abdominal belt signals has been proposed as an alternative signal for scoring apneas and hypopneas if the thermistor signal fails or is unreliable [1]. However, the use of RIP flow in combination with NP has its limitations. For instance, mouth breathing or misplacement of nasal cannula may lead to the absence of respiratory event detection. To overcome these limitations, several systems have been proposed that use tracheal sounds (TS) for OSA diagnosis [2,3,4,5,6].

TS signals correlate well with respiratory flow, with no significant difference in the number of apneas detected with TS or reference sensors [4,6,7,8,9]. Tracheal sounds, recorded at the sternal notch, reflect the superficial vibrations of the body set in motion by pressure fluctuations [10,11]. Placed on the sternal notch, the TS sensors can detect these vibrations and thus measure tracheal flow sound as well as snoring.

Our study aimed to evaluate the combination of TS and NP sensors for respiratory event detection and evaluation of apneas and hypopneas. The results were compared to those obtained with the recommended combination of thermistor and nasal pressure sensors.

Materials and methods

Patients

Patients with clinical suspicion of OSA who were scheduled for routine diagnostic polysomnography (PSG) were invited consecutively to participate in the study. All patients gave written informed consent, and the study was approved by the local Ethics Committee (application number: EA1/009/13) of the university hospital, Charité Universitätsmedizin Berlin. Inclusion criteria were: age between 18 and 70 years, a minimum recording of 6 h, and an AHI greater than 10/h. Patients with any sleep disorder other than OSA, with clinically unstable respiratory or cardiovascular disease, with excessive alcohol consumption, or taking any drug or medication that could influence sleep were excluded from the study. Age, height, and weight as well as medication and diagnoses of the patients were recorded.

Study procedure

In addition to PSG using the Embla N7000 system (Embla Inc., Broomfield, CO, USA), a polygraph CID102L equipped with the PneaVoX® TS sensor (CIDELEC, St. Gemmes sur Loire, France) was used. Recorded data included electrophysiological signals for sleep evaluation as well as airflow by NP and Therm, body position, actigraphy, RIP thoracic and abdominal movements, and pulse oximetry (SpO2). The PneaVoX® sensor was taped on the skin just above the suprasternal notch and then secured in place using an adhesive bandage. Correct positioning of the transducer is essential to ensure high signal quality. For adequate synchronization of the Embla and the CID102L recordings, the nasal pressure signal was connected to both systems using a T-adapter. Recordings were monitored, and the quality of all signals was checked throughout the night. Each PSG recording was scored manually by an experienced medical technician at the Charité Sleep Medical Center according to the AASM criteria [1]. All respiratory signals from the Embla system were imported into the CIDELEC system in European Data Format (EDF), and a new anonymized polygraph file was created for each patient.

Tracheal sound sensor

The TS sensor, PneaVoX®, is similar to a stethoscope with an embedded combination of a pressure sensor and an acoustic sensor. While the pressure sensor could be used for characterization of respiratory events [12], the acoustic sensor measures sound variations induced by: 1) high-pitch respiratory flow sounds with an acoustic intensity less than 76 decibels and a frequency between 200 and 2000 Hz; 2) low-pitch snoring sounds with an acoustic intensity greater than 76 decibels in the transducer chamber and a frequency between 20 and 200 Hz. Fig. 1 illustrates the process from acquisition of the raw tracheal sound acoustic signal to extraction of the analyzed respiratory flow sound and snoring signals.

Fig. 1 Schematic presentation of processing of acoustical raw signal from tracheal sound sensor PneaVoX® (CIDELEC, St. Gemmes sur Loire, France) to extract respiratory flow intensity and snoring energy
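The two acoustic categories described for the sensor can be expressed as a toy rule on frequency and intensity. This sketch is not the device's signal processing, which is not described in detail here; the function name is hypothetical, and assigning the shared 200 Hz boundary to the flow-sound band is an assumption.

```python
def classify_sound(freq_hz: float, intensity_db: float) -> str:
    """Toy rule based on the frequency and intensity ranges in the text.

    Hypothetical helper: the real sensor derives flow intensity and
    snoring energy from the raw signal, not from single (freq, dB) pairs.
    """
    # Assumption: the shared 200 Hz boundary is assigned to flow sounds.
    if 200 <= freq_hz <= 2000 and intensity_db < 76:
        return "respiratory flow sound"
    if 20 <= freq_hz < 200 and intensity_db > 76:
        return "snoring"
    return "unclassified"


# Illustrative inputs (not study data)
print(classify_sound(800, 60))  # respiratory flow sound
print(classify_sound(90, 82))   # snoring
```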

Data analysis

The synchronized recordings were independently scored for apneas and hypopneas using two different methods: 1) the TS-NP combination and 2) the Therm-NP combination. Analysis of the two methods was performed in a random order. The two scorings of respiratory events were then compared. AHI was based only on respiratory channels using validated total recording time (TRT) instead of total sleep time (TST). Only hypopneas with ≥3% arterial oxygen desaturation were included in AHI evaluation. The AHI was calculated for each method and the two values were compared for each patient.
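Because the index is a rate, the choice of denominator matters: the same event count yields a lower AHI when divided by recording time than by sleep time. The sketch below illustrates this with made-up numbers; the function name and the example values are hypothetical and are not taken from the study data.

```python
def events_per_hour(n_events: int, minutes: float) -> float:
    """Events per hour for a given denominator (TRT or TST, in minutes)."""
    if minutes <= 0:
        raise ValueError("duration must be positive")
    return n_events / (minutes / 60.0)


# Hypothetical night: 240 scored events, 480 min recorded, 400 min asleep
n_events = 240
ahi_trt = events_per_hour(n_events, 480)  # based on total recording time
ahi_tst = events_per_hour(n_events, 400)  # based on total sleep time
print(ahi_trt, ahi_tst)  # 30.0 36.0
```

Since TRT is at least as long as TST, a TRT-based AHI is never higher than the TST-based AHI for the same set of scored events.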

Statistical analysis

Statistical analysis was performed using IBM SPSS Statistics v.24.0 (IBM, Armonk, NY, USA). Values are presented as mean ± standard deviation (SD). The correlation between scoring results based on TS-NP and Therm-NP was assessed with Pearson’s correlation coefficient (two-tailed significance test). Cohen’s kappa, sensitivity, and positive predictive value (PPV) for event detection were calculated for all patients using the Therm-NP signal combination as the reference. In addition, Bland–Altman analysis was performed for visual analysis of agreement between TS-NP and Therm-NP based on AHI scoring.

Results

Patients

The patient group consisted of 33 subjects (6 women, 27 men) with a mean age of 52.9 ± 10.3 years and a mean body mass index (BMI) of 30.0 ± 5.2 kg/m². Twenty-one patients had concomitant diseases including one or more of the following: hypertension (n = 9), cardiac arrhythmia (n = 2), chronic obstructive pulmonary disease (COPD; n = 2), asthma (n = 2), respiratory failure (n = 1), obesity (n = 4), hypercholesterolemia (n = 3), diabetes (n = 1), and hypothyroidism (n = 1). The mean AHI based on TST from PSG (AHI-PSG) was 34.1 ± 24.2 events/h with an apnea index (AI-PSG) of 18.9 ± 25.2 events/h. Nineteen patients had mild-to-moderate OSA (5 ≤ AHI < 30/h) and 14 patients had severe OSA (AHI ≥ 30/h). The mean time in bed (TIB) was 470.1 ± 51.8 min and the sleep efficiency (SE) was 81.7 ± 11.0%.

Detection of events with TS-NP and Therm-NP

The total number of events detected using the sensor combination TS-NP was 7329 (4344 apneas and 2985 hypopneas). Using the Therm-NP combination, 7268 events (3251 apneas and 4017 hypopneas) were detected. With Therm-NP as the reference, the sensitivity of TS-NP was 93.0% (95% confidence interval 90.8 to 95.1%) and the PPV was 90.6% (95% confidence interval 87.6 to 93.7%).
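Sensitivity and PPV for event detection require pairing each test event with a reference event. The matching procedure is not specified in the text, so the sketch below assumes a simple rule (any temporal overlap counts as a match, each test event is used at most once); the function names and the example intervals are hypothetical.

```python
def match_events(reference, test):
    """Count (TP, FN, FP) by temporal overlap.

    Events are (start_s, end_s) tuples. The overlap rule is an
    assumption of this sketch, not the study's documented method.
    """
    used = set()
    tp = 0
    for r_start, r_end in reference:
        for j, (t_start, t_end) in enumerate(test):
            if j not in used and t_start < r_end and r_start < t_end:
                used.add(j)
                tp += 1
                break
    fn = len(reference) - tp   # reference events with no test match
    fp = len(test) - len(used)  # test events with no reference match
    return tp, fn, fp


def sensitivity(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def ppv(tp: int, fp: int) -> float:
    return tp / (tp + fp)


# Illustrative intervals in seconds (not study data)
ref = [(0, 15), (60, 80), (120, 140)]
tst = [(2, 14), (125, 138), (200, 215)]
tp, fn, fp = match_events(ref, tst)
print(tp, fn, fp, sensitivity(tp, fn), ppv(tp, fp))
```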

Correlation analysis for apnea and hypopnea detection using the two different sensor combinations revealed a strong positive correlation (r = 0.997, N = 33, p < 0.001) between TS-NP and Therm-NP (Fig. 2).

Fig. 2 Scatter plot of number of apneas and hypopneas based on scoring using sensor combinations thermistor with nasal pressure transducer (Therm-NP) versus tracheal sounds with nasal pressure transducer (TS-NP)

The mean AHI based on TRT validation using the synchronized recordings was 30.0 ± 22.6 events/h using TS-NP and 29.8 ± 22.9 events/h using Therm-NP (Fig. 3).

Fig. 3 Boxplots illustrating the apnea–hypopnea index (AHI) based on scoring using the sensor combinations tracheal sounds with nasal pressure transducer (TS-NP) and thermistor with nasal pressure transducer (Therm-NP)

With the Therm-NP as a reference detection, a kappa statistic value of 0.86 (standard error 0.08) for TS-NP revealed a high agreement for classifying OSA into the severity classes mild (5 ≤ AHI < 15/h), moderate (15 ≤ AHI < 30/h), and severe (AHI ≥ 30/h). Using TS-NP, misclassification by one severity class was observed in 3 patients (Table 1). There was a difference in AHI of more than 10% between the two methods in 4 patients.

Table 1 Contingency table for classification of subjects into mild, moderate, or severe sleep apnea based on scoring using sensor combinations of tracheal sounds with nasal pressure transducer (TS-NP) and thermistor with nasal pressure transducer (Therm-NP)
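Agreement on severity class can be quantified by mapping each AHI to a class and computing Cohen's kappa from the resulting contingency table. The sketch below uses the class boundaries given in the text; the function names and the 3 × 3 table are hypothetical and do not reproduce Table 1.

```python
def severity_class(ahi: float) -> str:
    """Severity class from AHI (events/h), using the cut-offs in the text."""
    if ahi >= 30:
        return "severe"
    if ahi >= 15:
        return "moderate"
    if ahi >= 5:
        return "mild"
    return "none"


def cohens_kappa(table) -> float:
    """Unweighted Cohen's kappa for a square contingency table.

    Rows are the reference classification, columns the test classification.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical table (mild, moderate, severe); NOT the study's Table 1
example = [
    [8, 1, 0],
    [1, 9, 1],
    [0, 0, 10],
]
print(round(cohens_kappa(example), 3))
```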

Fig. 4 displays the results of the Bland–Altman analysis for apnea and hypopnea detection using the Therm-NP sensor combination as the reference. The mean difference in the number of detected apneas and hypopneas between Therm-NP and TS-NP was small (−1.8 events), with limits of agreement between −27.7 and 25.4 events.

Fig. 4 Bland–Altman plot illustrating the agreement between number of apneas and hypopneas based on scoring using the sensor combinations thermistor with nasal pressure transducer (Therm-NP) and tracheal sounds with nasal pressure transducer (TS-NP). SD standard deviation
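For readers unfamiliar with the method, a Bland–Altman analysis reduces to the mean of the paired differences (the bias) and limits of agreement around it. The sketch below assumes the conventional limits of bias ± 1.96 SD, which the text does not state explicitly; the function name and the per-patient counts are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reference, test, z: float = 1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are reference minus test. The 1.96 multiplier is the
    conventional choice and an assumption of this sketch.
    """
    diffs = [r - t for r, t in zip(reference, test)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - z * sd, bias + z * sd


# Hypothetical per-patient event counts (not study data)
therm_np = [10, 20, 30]
ts_np = [12, 19, 33]
bias, lower, upper = bland_altman(therm_np, ts_np)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```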

Discussion

This study is the first to investigate the use of tracheal sounds in combination with nasal pressure transducer flow signals for AHI calculation in adult patients with OSA. Results were compared to those obtained with the AASM-recommended use of oronasal thermal airflow in combination with nasal pressure transducer flow signals. Overall, the use of combined TS-NP signals provided a highly accurate evaluation of OSA severity, categorizing patients correctly into severity groups as mild, moderate, or severe. Moreover, results for AHI calculation were not statistically different between the two methods. Under the assumption that the TS sensor is placed correctly on the sternal notch, its use in combination with NP provides a new method for correct identification of respiratory events in patients with OSA.

In total, 7329 apnea and hypopnea events were detected using the combined TS-NP signals. This was only 0.8% more than the number found by the Therm-NP reference scoring. In addition, the ranges of the number of apnea and hypopnea events detected per patient were comparable, from 28 to 735 events for TS-NP and from 26 to 741 events for Therm-NP. The sensitivity of TS-NP for event detection was good (93.0%), with a slightly lower PPV (90.6%).
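The relative difference in total events, and the share of apneas under each method, follow directly from the event totals reported in the Results. The short check below uses only those reported counts; the variable names are chosen for illustration.

```python
# Event totals reported in the Results
ts_apneas, ts_hypopneas = 4344, 2985        # TS-NP
therm_apneas, therm_hypopneas = 3251, 4017  # Therm-NP

ts_total = ts_apneas + ts_hypopneas          # 7329
therm_total = therm_apneas + therm_hypopneas  # 7268

# Relative excess of TS-NP events over the Therm-NP reference, in percent
rel_diff_pct = 100 * (ts_total - therm_total) / therm_total

# Share of apneas among all scored events, in percent
ts_apnea_share = 100 * ts_apneas / ts_total
therm_apnea_share = 100 * therm_apneas / therm_total

print(round(rel_diff_pct, 1))    # 0.8
print(round(ts_apnea_share))     # 59
print(round(therm_apnea_share))  # 45
```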

These results are also confirmed by the Bland–Altman plot (Fig. 4), which shows that the patient-to-patient variability of differences was small. In only one subject did the difference between methods reach 48 events, due to a higher number of hypopneas scored with the sensor combination Therm-NP.

Good agreement between methods is also mirrored by the fact that 30 (91%) OSA patients were correctly classified according to severity classes; however, compared to Therm-NP, TS-NP classified one patient with mild OSA as moderate, one with moderate OSA as severe, and one patient with moderate OSA as mild (Table 1).

In our study, the AASM respiratory rules for adults [1] were used as reference. These recommend an oronasal thermal airflow sensor for apnea detection and a nasal pressure transducer for hypopnea detection as the first choices for flow measurement. We used a thermistor attached to the nose and mouth as well as a nasal cannula. Reference scoring with the Therm-NP sensor combination yielded an event distribution of 45% apneas and 55% hypopneas. Using the sensor combination TS-NP, more apneas (59%) than hypopneas (41%) were scored. One explanation for the lower apnea scoring rate of Therm-NP could be that, in certain cases, flow reductions of ≥90% compared to pre-event baseline did not result in amplitude reductions of ≥90% in the Therm signal, and these events were therefore scored as hypopneas rather than apneas. The tendency of thermistor sensors to underestimate flow reductions, together with their non-linear signal characteristic, has been reported previously [13] and is mentioned by the AASM [1]. In a recent study [14], we showed that the TS signal detected more apneas than the Therm signal. On the other hand, the TS may indeed overestimate the number of apneas. In some respiratory events, the TS sensor was not able to sense residual flow with amplitude reductions of less than 90%, and these events were therefore classified as apneas rather than hypopneas. One can speculate that anthropometric and other factors, such as height, neck circumference, or BMI, influence the sensitivity of TS for sensing low flow rates. For instance, the Bland–Altman plot in Fig. 4 shows two subjects with 25 and 28 fewer events scored using the Therm-NP sensor combination than with TS-NP, and both were obese. However, due to the design of this study, this issue could not be systematically analyzed and remains a question for future trials.

In order to investigate these issues in detail and to quantify absolute flow rates, one has to perform gold standard recordings with a pneumotachograph (PNT). In a previous study with a prototype of the TS sensor PneaVoX®, a parallel recording with a PNT revealed no difference in the number of detected apneas [4]. A limitation of our study is that we did not measure flow with a PNT. Future studies with the TS sensor should include this measurement in order to confirm the nature of the detected events.

Conclusion

The current study shows that the combination of tracheal sound and nasal pressure sensors detects a comparable number of respiratory events to the combination of thermistor and nasal pressure sensors, despite the difference in apnea-to-hypopnea ratio between the two methods. In this cohort, the use of tracheal sound did not materially change AHI values and allowed accurate sleep apnea diagnosis and severity assessment. Furthermore, the tracheal sound sensor used in this study is practical, easy to put in place, and well tolerated by patients. It does not disturb sleep and is less likely to move or be displaced during sleep than thermistors. Thus, tracheal sound sensors can be used as a substitute for oronasal thermistors in sleep recording systems. However, prospective evaluation in a larger group of patients with a PNT flow measurement is needed to confirm the clinical value of this approach.