Introduction
Atrial fibrillation (AF) is the most frequent arrhythmia but often remains unrecognized. Early detection is highly relevant to reducing stroke, heart failure and mortality but represents a major clinical challenge [1]. Developing new non-invasive methods for early diagnosis of paroxysmal AF (pAF) represents an important task to improve prevention of adverse effects of AF such as stroke, heart failure or cardiomyopathy [2, 3].
As part of the Apple Heart Study and the Huawei Heart Study, the use of smart watches for AF detection was evaluated in several hundred thousand subjects [4, 5]. These studies, however, were conducted in unselected populations, so AF was detected in only 0.5% and 0.2% of the respective study populations. Population-level screening for AF bears the risk of large numbers of false-positive results and might cause unnecessary additional investigations or treatments [6]. Screening investigations are therefore particularly valuable in patient groups with an increased risk of AF [6]. Several models were established to predict the risk of developing AF within the next 5 or 10 years based on clinical data and blood biochemistry measures, using Cox regression and procedures for selecting predictive parameters [7–12]. Scores developed within the CHARGE-AF consortium were challenged in independent studies based on electronic medical record databases and further developed using machine learning methods [12, 13].
It is well established that echocardiography is informative about the hemodynamic and mechanical cardiac functions. Several echocardiographic measures, such as left atrial (LA) enlargement, increased left ventricular (LV) wall thickness or reduced end-diastolic to end-systolic fractional shortening of the LV, are associated with an increased risk of developing AF or AF-dependent complications [14–16]. Screening for pAF in patients undergoing echocardiography is therefore reasonable, because the prevalence of pAF in this patient group is high: in the echocardiographic laboratory of the Department of Cardiology at Heidelberg University (Heidelberg, Germany), a prevalence of about 16% was observed [17].
Recently, we systematically evaluated the predictive value of various echocardiographic parameters for AF and developed mathematical models and scores that predict the presence of pAF from medical history and echocardiographic parameters and can be easily implemented in routine clinical practice [17]. The derivation cohort contained 47 clinical and echocardiographic parameters from 1000 patients. The optimal score variant includes the 12 echocardiographic and medical history parameters that are most predictive for discriminating between pAF and sinus rhythm (SR). To further simplify clinical application, a reduced score with four parameters (age, LA diameter, tissue Doppler imaging velocity during atrial contraction (TDI, A’), and aortic root diameter) was developed.
In this prospective, multicentric study, we tested the pAF prediction scores in 305 patients with unknown AF status by continuous ECG monitoring over a period of up to 21 days. We could thereby validate the developed prediction scores, subsequently termed ‘ECHO-AF’ scores, as a non-invasive tool for detecting pAF that can be easily implemented in clinical practice and might serve as a screening test to initiate further diagnostic investigations for validating the presence of pAF.
Methods
Study population
This study included echocardiographic and additional clinical data of 305 patients without a diagnosis of AF or atrial flutter (50.2% males, 49.8% females) who were interested in undergoing a screening investigation for AF. Patients were enrolled between May 2016 and November 2019 at the Department of Cardiology and the Department of Neurology of the University Hospital of Heidelberg (Germany), and at four cardiology practices in Germany (Heidelberg, Lüneburg, Essen). Patient data were collected in a de-identified manner. Based on the observed prevalence of pAF in our echocardiographic laboratory, we estimated that about 300 subjects were necessary to estimate sensitivity values with confidence intervals of ± 10% [18, 19]. The study protocol was approved by the ethics committee of the University of Heidelberg (Germany, Medical Faculty Heidelberg, S-491/2015). Written informed consent was obtained from all patients, and the study was conducted in accordance with the Declaration of Helsinki.
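The sample-size reasoning above can be illustrated with a rough check. The following is a minimal sketch, not the procedure of the cited references: it assumes a normal-approximation confidence interval for a proportion, uses the prevalence of about 16% reported for the echocardiographic laboratory, and takes a hypothetical target sensitivity of 0.85 that is an assumption made only for this illustration.

```python
import math


def sensitivity_ci_half_width(n_total, prevalence, sensitivity, z=1.96):
    """Approximate 95% CI half-width of the sensitivity estimate.

    Sensitivity is estimated only from diseased subjects, so the effective
    sample size is n_total * prevalence (normal approximation).
    """
    n_positive = n_total * prevalence
    return z * math.sqrt(sensitivity * (1.0 - sensitivity) / n_positive)


def required_total_subjects(prevalence, sensitivity, half_width, z=1.96):
    """Total number of subjects needed for a given CI half-width."""
    n_positive = z ** 2 * sensitivity * (1.0 - sensitivity) / half_width ** 2
    return math.ceil(n_positive / prevalence)


# Assumed inputs: prevalence 0.16 (from the text), sensitivity 0.85 (hypothetical).
print(sensitivity_ci_half_width(300, 0.16, 0.85))
print(required_total_subjects(0.16, 0.85, 0.10))
```

Under these assumptions, a cohort of about 300 subjects yields a half-width close to 10 percentage points; a different assumed sensitivity would shift the required number.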
The collected score parameters consisted of medical history parameters (age, smoking status, heart frequency, sleep apnea, hyperlipidemia, type 2 diabetes mellitus, catheter ablation), medication (beta-blocker), and echocardiographic parameters (aortic root diameter, left atrial diameter, left ventricular end-systolic diameter, TDI A’ velocity). The catheter ablation parameter refers to ablations of accessory pathways or of atrioventricular nodal reentry tachycardia. Patients with a history of atrial fibrillation or atrial flutter ablation were excluded.
Echocardiography and long-term Holter ECG
Transthoracic echocardiography examinations were performed on commercially available ultrasound systems (GE Healthcare, Philips, Sony). Images included parasternal, apical and subxiphoidal views using 1.5–4.0 MHz phased-array transducers. All examinations were performed with 2D echocardiography for anatomic imaging and Doppler echocardiography for assessment of velocities. Left atrial size was determined as the maximal distance between the posterior aortic root wall and the posterior left atrial wall at end-systole. Aortic root and left ventricular end-systolic diameters (LV, ESD) were obtained in the parasternal long axis view. The TDI A’ velocity was measured in the apical four-chamber view. Patients wore 3-lead 7-day Holter ECG devices (Mortara Instruments) for up to 3 weeks; a few patients wore the device for 4 weeks. ECG recordings were evaluated by a cardiologist blinded to the score parameters and the calculated score values. AF was diagnosed if AF episodes longer than 20 s were documented.
Statistical methods
Continuous variables were compared between the SR and pAF groups using one-way analysis of variance (ANOVA) and are reported as mean ± standard deviation (Table 1). Categorical variables were compared between groups with the two-tailed Fisher exact test. Predictive scores were previously derived from logistic regression models calibrated with 47 parameters recorded in 1000 patients of the Department of Cardiology at Heidelberg University (Heidelberg, Germany) [17]. Of these 47 parameters, the 12 most predictive parameters were identified by sequential feature selection and likelihood-ratio testing. Scores were scaled between 0 and 100. The 12-parameter score is given by Eq. (1), and the 4-parameter score by Eq. (2):
$$\begin{aligned} L_{{12}} & = - 17.07 + 0.3359 \times \frac{{{\text{age}}}}{{\text{y}}} + 0.8700 \times \frac{{{\text{Ao,root}}}}{{{\text{mm}}}} + 0.7512 \times \frac{{{\text{LA}}}}{{{\text{mm}}}} - 0.3331 \times \frac{{{\text{LV,ESD}}}}{{{\text{mm}}}} \\ & - 1.570 \times \frac{{{\text{TDI,A}}^{'} }}{{{\text{cm}}/{\text{s}}}} + 0.1527 \times \frac{{{\text{HF}}}}{{1/\min }} + 10.98 \times {\text{sleep}}\;{\text{apnea}} + 4.172 \times {\text{hyperlipidemia}} - 0.1995 \times {\text{type}}\;{\text{II}}\;{\text{diabetes}} \\ & + 0.7565 \times {\text{smoker}} + 5.307 \times \beta \text{-} {\text{blocker}} + 21.39 \times {\text{catheter}}\;{\text{ablation}}. \\ \end{aligned}$$
(1)
$$L_{4} = - 22.96 + 0.4997 \times \frac{{{\text{age}}}}{{\text{y}}} + 0.9188 \times \frac{{{\text{Ao,root}}}}{{{\text{mm}}}} + 0.9459 \times \frac{{{\text{LA}}}}{{{\text{mm}}}} - 1.583 \times \frac{{{\text{TDI}},{\text{A}}^{'} }}{{{\text{cm}}/{\text{s}}}}.$$
(2)
Table 1
Comparison of score parameters between patient groups
Parameters | SR | pAF |
Age (years) | 58.7 ± 18.6 | 69.3 ± 12.5** |
Aortic root (mm) | 28.8 ± 6.3 | 34.3 ± 4.5*** |
LA (mm) | 34.4 ± 7.4 | 40.4 ± 5.5*** |
LV ESD (mm) | 44.3 ± 12.6 | 34.2 ± 10.1*** |
Sleep apnea [n (%)] | 28 (10) | 1 (3) |
Hyperlipidemia [n (%)] | 89 (33) | 14 (41) |
Diabetes mellitus [n (%)] | 38 (14) | 7 (21) |
Smoker [n (%)] | 62 (23) | 2 (6)* |
Beta-blocker [n (%)] | 110 (41) | 17 (50) |
Catheter ablation [n (%)] | 18 (7) | 6 (18)* |
TDI A’ (cm/s) | 8.2 ± 3.5 | 7.0 ± 4.2 |
Heart rate (1/min) | 73.8 ± 12.6 | 76.8 ± 12.0 |
In Eqs. (1) and (2), variables (age; Ao,root, aortic root diameter; LA, LA diameter; LV,ESD, LV end-systolic diameter; TDI,Aʹ, tissue Doppler imaging, late diastolic velocity of mitral annulus; HF, heart frequency) are divided by their units (y, years; mm, millimetres; cm/s, centimetres per second; 1/min, per minute). Categorical variables (sleep apnea; hyperlipidemia; type II diabetes; smoker; β-blocker, β-blocker intake; catheter ablation, status after catheter ablation) are set to 1 if present and 0 if absent. Using the 12-parameter score, presence of pAF was predicted in case of \(L_{12} \ge 58.35\), and using the 4-parameter score, pAF was predicted in case of \(L_{4} \ge 63.32\). An online calculator was created to simplify application of the scores [20].
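To make the application of Eqs. (1) and (2) concrete, the following minimal sketch evaluates both scores and applies the cut-offs quoted above. The coefficients and cut-offs are copied from the text; the function names, argument names and the example patient are hypothetical and chosen only for illustration. This is not the code behind the online calculator.

```python
# Cut-offs quoted in the text for predicting the presence of pAF.
L12_CUTOFF = 58.35
L4_CUTOFF = 63.32


def echo_af_l12(age_y, ao_root_mm, la_mm, lv_esd_mm, tdi_a_cm_s, hf_per_min,
                sleep_apnea=0, hyperlipidemia=0, diabetes_t2=0,
                smoker=0, beta_blocker=0, catheter_ablation=0):
    """12-parameter score, Eq. (1). Categorical inputs are 1 (present) or 0 (absent)."""
    return (-17.07
            + 0.3359 * age_y
            + 0.8700 * ao_root_mm
            + 0.7512 * la_mm
            - 0.3331 * lv_esd_mm
            - 1.570 * tdi_a_cm_s
            + 0.1527 * hf_per_min
            + 10.98 * sleep_apnea
            + 4.172 * hyperlipidemia
            - 0.1995 * diabetes_t2
            + 0.7565 * smoker
            + 5.307 * beta_blocker
            + 21.39 * catheter_ablation)


def echo_af_l4(age_y, ao_root_mm, la_mm, tdi_a_cm_s):
    """4-parameter score, Eq. (2)."""
    return (-22.96
            + 0.4997 * age_y
            + 0.9188 * ao_root_mm
            + 0.9459 * la_mm
            - 1.583 * tdi_a_cm_s)


def predicts_paf(score, cutoff):
    """pAF is predicted when the score reaches or exceeds its cut-off."""
    return score >= cutoff


# Hypothetical patient (illustrative values only, not study data).
l4 = echo_af_l4(age_y=70, ao_root_mm=34, la_mm=40, tdi_a_cm_s=7.0)
l12 = echo_af_l12(age_y=70, ao_root_mm=34, la_mm=40, lv_esd_mm=34,
                  tdi_a_cm_s=7.0, hf_per_min=77)
print(f"L4  = {l4:.2f}, pAF predicted: {predicts_paf(l4, L4_CUTOFF)}")
print(f"L12 = {l12:.2f}, pAF predicted: {predicts_paf(l12, L12_CUTOFF)}")
```

Continuous inputs are passed in the units named in the equations (years, mm, cm/s, 1/min), so no unit conversion is performed inside the functions.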
For comparison, we assessed the predictive performance of logistic regression models containing only subsets of variables that are part of the 12-parameter score. We tested the following reduced models: (1) a model without any echocardiographic parameters, containing the parameters age, heart frequency, sleep apnea, hyperlipidemia, type II diabetes, smoker, beta-blocker and catheter ablation, (2) a model containing the parameters age, gender and BMI, and (3) a model containing only the parameter age. Parameters of these reduced models were obtained by calibrating logistic regression models on the dataset that was previously used to establish the 4-parameter and 12-parameter scores [17]. ROC curves of these models were obtained as previously described using 100-fold cross-validation [17]. We further compared the predictive performance of our scores with two previous prediction scores, the HAVOC and the ACTEL scores [21, 22]. These two scores were developed to predict AF in patients with cryptogenic stroke or transient ischemic attack (TIA) and also use non-invasive clinical parameters. For these scores, 95% confidence intervals were estimated by bootstrapping with n = 1000 samples.
To assess classification performance, area under the curve (AUC) values, sensitivities, specificities and precisions were analysed. We estimated 95% confidence intervals for sensitivity, specificity, precision and AUC values by bootstrapping with n = 1000 samples. All analyses were performed based on pre-implemented functions and custom scripts in MATLAB (MathWorks).
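The bootstrap procedure described above can be sketched as follows. This is an illustrative Python version under stated assumptions, not the authors' MATLAB scripts: it uses a percentile bootstrap, covers only the threshold-based metrics (sensitivity, specificity, precision) and not the AUC, and all function names are hypothetical.

```python
import random


def confusion_metrics(labels, predictions):
    """Sensitivity, specificity and precision from binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


def bootstrap_ci(labels, predictions, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for one metric."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_boot):
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        value = confusion_metrics([labels[i] for i in idx],
                                  [predictions[i] for i in idx])[metric]
        if value == value:  # skip resamples where the metric is undefined (NaN)
            values.append(value)
    values.sort()
    lower = values[int((alpha / 2) * (len(values) - 1))]
    upper = values[int((1 - alpha / 2) * (len(values) - 1))]
    return lower, upper


# Toy data (not study data): 10 pAF and 40 SR patients with binary score results.
labels = [1] * 10 + [0] * 40
predictions = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 36
print(confusion_metrics(labels, predictions))
print(bootstrap_ci(labels, predictions, "sensitivity"))
```

Resampling whole patients keeps each label paired with its prediction, so the interval reflects sampling variability of the cohort rather than of the score itself.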
Discussion
In this study, the 12-parameter and 4-parameter ECHO-AF scores for pAF prediction were prospectively validated. pAF was newly diagnosed in 34 study patients. Further diagnostic validation and a positive CHA2DS2-VASc score confirmed that oral anticoagulation was necessary in these patients.
The moderate precision values (12-parameter score: 34%, 4-parameter score: 24%) result from the small fraction of pAF patients (11%) relative to the fraction of SR patients in the study sample. Therefore, the scores cannot replace documentation of AF by ECG measurements but represent highly sensitive screening tests to select patients for further diagnostic validation of the presence of pAF by long-term Holter ECG measurements. Our investigation showed that the scores can be used to select patients in whom long-term ECG monitoring should be conducted. For example, a positive score result could be coupled to further screening tools such as wearing a smart watch that can measure photoplethysmography signals, or long-term Holter ECG monitoring. Using the scores to narrow down the patient group in whom further AF screening is performed can reduce costs [6].
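The dependence of precision on the fraction of pAF patients follows from Bayes' rule and can be sketched as below. The sensitivity and specificity values in the example are hypothetical and are not results of this study; only the pAF fraction of 11% is taken from the text.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Precision (positive predictive value) from Bayes' rule."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test characteristics, evaluated at several prevalences.
for prevalence in (0.005, 0.11, 0.30, 0.50):
    ppv = positive_predictive_value(0.90, 0.80, prevalence)
    print(f"prevalence {prevalence:.3f} -> precision {ppv:.3f}")
```

For a fixed sensitivity and specificity, precision rises with prevalence, which is why a test with good discrimination can still show moderate precision in a cohort where only about one in nine patients has pAF.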
Previously, several clinical studies defined AF prediction scores based on blood serum parameters [8–12, 23]. In contrast, our study focused on non-invasive clinical parameters, particularly on echocardiographic parameters. These are often available in cardiology practices, whereas several serum biomarkers predictive for AF are frequently unavailable in patients not treated in a hospital setting. Furthermore, echocardiographic parameters reflecting cardiac anatomy, blood flow and tissue velocities are immediate indicators of cardiac function. It is therefore likely that they are predictive for arrhythmias with pathophysiological consequences. Comparing the developed scores with variants without imaging parameters underlined that echocardiographic parameters are important for predicting the presence of AF.
We compared the ECHO-AF scores with the previously established HAVOC and ACTEL scores. ROC curves of these scores showed smaller AUC values than those of the ECHO-AF scores. This discrepancy can probably be attributed to differences in the underlying patient populations: the HAVOC and ACTEL scores were developed using clinical data of patients after stroke or TIA, in whom the prevalence of AF was substantially higher than in our study population. Like the HAVOC and ACTEL scores, the C2HEST score is an easily applicable tool that predicts AF by summing integer points associated with the risk factors coronary artery disease, COPD, hypertension, age above 75 years, systolic heart failure and hyperthyroidism [26]. An internal validation of the score showed an AUC value of 0.75, whereas the external application showed an AUC value of 0.65, which is modest compared with the ECHO-AF scores evaluated in this study. In general, scores obtained by summing integer points associated with clinical features are easy to calculate but less precise than scores represented by more complex equations, such as the ECHO-AF scores.
In contrast to the Apple Heart Study and the Huawei Heart Study, this screening trial was performed in a more selective patient cohort with higher pAF prevalence [4, 5]. In this context, the detection of an irregular rhythm by a smart watch algorithm is not equivalent to the detection of AF, which has consequences for the false-positive detection rate of smart watches. Moreover, rare arrhythmia episodes detected by a smart watch do not imply the same stroke risk as clinically diagnosed AF.
Other scores developed to predict the 5-year or 10-year risk of developing pAF require parameters measured by cardiac computed tomography or specialized laboratory tests (NT-proBNP, troponin T) [10, 12]. In contrast to these diagnostic procedures, echocardiography is routinely performed in many cardiology patients. Applying the developed pAF prediction scores in this patient population further increases the clinical value of echocardiographic parameters. Analysing cumulative fractions of AF diagnoses depending on the ECG monitoring duration indicated that extended ECG monitoring for more than 1 week is required to reliably test for the presence of pAF, in accordance with previous long-term ECG monitoring studies [27, 28].
This study has limitations that have to be acknowledged. First, ECG monitoring durations varied strongly between subjects, so several cases of pAF might have remained unrecognized. Our scores were evaluated in patients of different age groups, with different health profiles, and in hospital patients as well as in an outpatient setting, which supports their general applicability. More selective subgroup analyses could improve the performance of the prediction tool. For example, subanalyses could compare hospital patients with outpatients, patients without and after thromboembolic events, or patients without and after cardiac surgery. Additional parameters that were not included in the 47 parameters used to develop the pAF prediction scores, such as biomarkers (BNP, FGF-23, GDF-15), could further increase predictive performance [17, 23]. Validation of our developed pAF prediction scores in a larger study population would further reduce uncertainties of the parameters describing the predictive performance. An independent external prospective validation will be important for testing the general applicability of the ECHO-AF scores.
Conclusion
The novel risk prediction scores adequately predicted pAF based on variables readily available during routine cardiac check-up and echocardiography. The value of clinical parameters that are often known in cardiology patients can thereby be increased and the early detection of pAF improved. Screening for pAF in patients undergoing echocardiography is reasonable because of the comparatively high prevalence of pAF in this patient group. To simplify application of the scores, an online calculator is provided [20]. Collectively, the developed model scores represent a simple, highly sensitive and non-invasive tool for detecting pAF that can be easily implemented in clinical practice and might serve as a screening test to initiate further diagnostic investigations for documenting the presence of pAF.
Compliance with ethical standards
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.