Introduction
Metabolic and bariatric surgery (MBS) is currently the most effective long-term treatment for severe obesity with or without associated medical conditions. Despite the low rates of complications after MBS, gastrointestinal leaks are concerning adverse events, with rates of <1.3% and <0.15% after primary Roux-en-Y gastric bypass (RYGB) and sleeve gastrectomy (SG), respectively, and a 2.4% risk for all new MBS [1, 2]. Leakage remains an important cause of post-MBS morbidity and mortality [3‐5].
Several factors have been examined in relation to leak: higher BMI was a risk factor, staple height and use of buttressing material did not affect the leak rate, and a bougie size ≥40 Fr was associated with fewer leaks. In addition, longer time to referral and lower serum prealbumin level were independently associated with poorer evolution of post-SG gastric leak under conservative management [6].
Management of post-MBS leaks involves surgical, endoscopic, and/or radiological options [7]. Surgery is associated with significant morbidity/mortality, whereas endoscopy offers significant benefits for selected patients [8, 9], and stent placement is less invasive than surgery. A challenge in managing post-MBS leaks is the scarcity of well-designed studies to guide evidence-based treatment [10].
The literature reveals knowledge gaps. Stents seal the defect and divert luminal content, allowing mucosal wall healing, early oral intake, and reduced stricture formation [11]. However, post-MBS stent research recruited heterogeneous patients, examined stent efficacy for a variety of indications, lacked standardization, and revealed practice discrepancies; moreover, several classification systems for post-MBS leaks exist, with no universal adoption or comprehensive validation [12]. Hence, there is no consensus on the most appropriate endoscopic approach to managing upper gastrointestinal anastomotic leaks [13] and no solid evidence to guide optimal stent treatment for the best outcomes [10, 14]. With such uncertainty, a predictive tool is needed to identify the patients with post-MBS leaks who are likely to fail stent therapy. This is important, as leaks are life-threatening [15, 16].
Anticipating the risk of stent failure is critical to individualized treatment approaches that involve selection of optimal patients, MBS type, BMI category, associated medical conditions, leak site/size, and stent type/length. A one-size-fits-all approach does not exist, and successful management requires a tailored approach premised on clinical parameters, surgical features, local expertise, and availability of devices [17].
Hence, a clinically feasible stent failure risk prediction model is critical to accurately forecast stent failure risk for post-MBS leaks, devise preventive strategies, and reduce mortality. To the best of our knowledge, no previous research undertook such a task.
Therefore, the aim of this study was to develop and externally validate a machine learning (ML)–based risk model (Alexandria-Bari-Stent) to predict stent failure in post-MBS leaks. The specific objectives were to:
1. Develop a clinical predictive model for post-MBS stent outcomes employing patients from one MBS center (development sample).
2. Identify the demographic, surgical, clinical, leak, and stent-related variables associated with stent outcomes (predictors).
3. Compare the performance of ML algorithms and select the most appropriate model.
4. Evaluate the final model’s performance on an external validation dataset.
5. Assess the model’s permutation-based feature importance.
6. Convert the model’s coefficients into a point-based scoring system for predicting stent outcomes.
7. Externally validate the risk score model on another 150 patients (external validation sample).
8. Calibrate the performance of the point-based risk score model in the external validation dataset.
9. Appraise the model’s clinical utility using decision curve analysis (DCA).
We also sought to generate a flowchart of the management outcomes of the 400 post-MBS leakages.
Discussion
This study developed and externally validated a novel ML-based risk score (Alexandria-Bari-Stent) to predict stent failure in post–bariatric leaks. The model demonstrated excellent discrimination (AUROC 0.85) and calibration in an independent cohort identifying high-risk patients. It was particularly effective at ruling out potential stent failure: patients classified as low failure risk had ≈91% NPV, indicating that the model reliably identified individuals unlikely to experience failure. This high NPV is clinically valuable, supporting more confident decision-making in deferring unnecessary interventions or follow-up testing for low-risk patients. This externally validated tool (Alexandria-Bari-Stent) is the first in the MBS leak setting.
Post-MBS leakage can ensue without recognizable technical problems during the procedure and is the second leading cause of postoperative mortality [26, 27]. Anticipating stent failure risk is key to individualized treatment. However, there are no clinically feasible post-MBS stent failure risk prediction models to accurately forecast failure, no consensus on the endoscopic management of such leaks, and no data to support a precise algorithm [28, 29].
In response, the current study developed the Alexandria-Bari-Stent, a clinical post-MBS ML-based risk model to predict stent failure for leaks, employing 250 patients from one MBS center, and externally validated on 150 patients from another center. We also evaluated the model’s discrimination, calibrated it against observed outcomes, and assessed its clinical utility using DCA.
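As a brief illustration of the quantity that DCA plots, the following sketch computes net benefit at a single threshold probability using the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt). The counts and the 0.30 threshold are hypothetical placeholders rather than values taken from the study data, and the function names are illustrative.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of a decision rule at one threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    # Odds of the threshold: how much a false positive "costs"
    # relative to the benefit of a true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(events: int, n: int, threshold: float) -> float:
    """Reference strategy: classify every patient as high risk."""
    return net_benefit(tp=events, fp=n - events, n=n, threshold=threshold)


# Hypothetical counts for illustration only (not study data).
N, EVENTS, TP, FP = 150, 45, 36, 18
PT = 0.30  # hypothetical threshold probability

model_nb = net_benefit(TP, FP, N, PT)
treat_all_nb = net_benefit_treat_all(EVENTS, N, PT)
treat_none_nb = 0.0  # treating nobody has zero net benefit by definition

print(f"model: {model_nb:.3f}, treat all: {treat_all_nb:.3f}, "
      f"treat none: {treat_none_nb:.3f}")
```

A decision curve repeats this calculation across a range of thresholds; a model is considered clinically useful over the thresholds where its net benefit exceeds both reference strategies.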
Our main findings were that the significant predictors of post-MBS stent failure comprised eight high contributors (OSA, hypertension, diabetes, hepatomegaly, hyperlipidemia, BMI, Niti-S18 stent, GJ anastomosis leak) and nine features with varying contributions (revisional surgery, Niti-S23 stent, time to stent implantation, leak size >1 cm, age, RYGB surgery, EGJ leak, Hanaro 21 stent, male sex). External validation demonstrated the diagnostic performance in predicting stent failure (0.85 AUROC and 0.81 AUCPR), indicating good discriminative ability and strong performance in identifying true stent failure cases while accounting for class imbalance. The model’s 80.0% sensitivity and 66.7% PPV indicated reasonable ability in identifying patients at risk of stent failure. Its 82.9% specificity and 90.6% NPV meant it was particularly effective at identifying patients unlikely to fail. Clinically, the model is more reliable for ruling out stent failure than for confirming it, especially useful in reassuring low-risk leakage patients. The calibration illustrated reasonable agreement between the predicted and observed failure rates, with a Brier score of 0.15 indicating acceptable calibration accuracy, and improved calibration as predicted risk increased, particularly in the mid-to-high probability range.
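The four reported classification metrics are linked through a single 2×2 table. The sketch below recomputes them from counts that were back-calculated to be consistent with the reported percentages in a 150-patient validation cohort (36 true positives, 9 false negatives, 18 false positives, 87 true negatives). These counts are an assumption made for illustration; they are not reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # predicted failure, stent failed
    fn: int  # predicted success, stent failed
    fp: int  # predicted failure, stent succeeded
    tn: int  # predicted success, stent succeeded


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Sensitivity, specificity, PPV, and NPV from a 2x2 table."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: counts inferred from the reported percentages, not source data.
ASSUMED_COUNTS = ConfusionCounts(tp=36, fn=9, fp=18, tn=87)

metrics = diagnostic_metrics(ASSUMED_COUNTS)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, the failure prevalence is 45/150 = 30%, which helps explain why NPV exceeds PPV even though sensitivity and specificity are similar.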
The absence of predictive models of post-MBS stent failure for leak renders head-to-head comparisons of our model vis-a-vis others unfeasible. However, examination of other published ML-based MBS prediction models highlights the favorable findings of the current study, demonstrating its rigorous approach to the development, testing, and calibration of the Alexandria-Bari-Stent’s diagnostic performance.
Existing applications of ML in MBS have reported moderate predictive power for various outcomes, e.g., AUROCs of 0.77 for liver fibrosis in severe obesity [30], 0.65–0.7 and 0.64–0.68 for postoperative complications [31, 32], 61–64% for complications in conversion surgery [33], and 0.67–0.78 for readmissions [34‐36]. Our model’s performance (AUROC 0.85, 95% CI: 0.76–0.93) was notably high and, more importantly, the model is one of the very few with true external validation.
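For readers less familiar with the metric, AUROC equals the probability that a randomly chosen failure case receives a higher predicted risk than a randomly chosen success case, with ties counted as one half. The sketch below computes it by direct pairwise comparison on a small hypothetical set of scores; the data and the function name are illustrative only.

```python
def auroc(y_true, y_score):
    """AUROC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = stent failure) and predicted risks.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.4]

print(f"AUROC = {auroc(labels, scores):.3f}")
```

The pairwise form is quadratic in sample size and is meant for clarity; rank-based implementations give the same value more efficiently.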
The current study externally validated the risk score model on stent patients from another center. Although external validation is the “ideal,” many ML studies in MBS did not undertake or report it, conducting internal validation instead [30‐32, 34‐40]. Internal validation is insufficient to substantiate that a model that successfully predicts the outcome of interest is valuable or applicable to new individuals, particularly since predictive models tend to predict observations in the derivation dataset more accurately than in new data [41, 42]. A further attestation to the robustness of the current study’s model is its higher accuracy compared to other published ML models, even though we externally validated the model whereas others undertook internal validation [30, 31, 37].
Few if any bariatric ML studies reported calibration [30‐32, 34‐40, 43]. We demonstrated acceptable calibration (Brier ~0.15), meaning that the predicted risk corresponds well to actual risk, which is critical for clinical prediction models used in decision-making [44].
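The Brier score is the mean squared difference between the predicted probability and the observed binary outcome, so lower values are better. The sketch below shows the calculation on a handful of hypothetical predictions; the data are invented for illustration and do not come from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = stent failure) and predicted failure risks.
outcomes = [1, 0, 0, 1, 0]
predicted = [0.8, 0.2, 0.1, 0.6, 0.4]

score = brier_score(outcomes, predicted)
# Reference point: always predicting 0.5 scores 0.25 regardless of outcome.
coin_flip = brier_score(outcomes, [0.5] * len(outcomes))

print(f"Brier score: {score:.3f} (constant 0.5 prediction: {coin_flip:.2f})")
```

Because the score blends calibration and discrimination, it is usually read alongside a calibration plot rather than on its own.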
The mean age, BMI, and female majority of the patients employed in our model development concur with a meta‐analysis of stent management for post-MBS leaks [1], as well as with studies of post-SG leaks undergoing endoscopic stenting [45]. Our observed 1.2% mortality was close to the 1.4% mortality reported elsewhere [27].
Few studies assessed the characteristics associated with stent failure. We found associated medical problems (diabetes, hypertension, hyperlipidemia, OSA) to be strongly associated with stent failure. Similarly, others observed higher complication rates after bariatric stenting among patients with diabetes, reaching up to a fourfold increased risk [21, 46]. However, a smaller series did not find diabetes predictive [43]. Endoscopic stents simultaneously manage leaks and strictures when present [21, 47‐49], and the self-expandable metal stents we used are designed with the capability of leak sealing and stricture dilation functions.
We noted that BMI influenced stent failure, concurring with the finding that stent migration was significantly more frequent with higher BMI [46]. Although some studies reported that age was not associated with failure [21, 50], males were more represented in our failure cohort. While sex has not been widely reported as a failure factor [21, 50], this finding might reflect underlying risk profiles and warrants further investigation.
The time to stent implantation was a contributor to failure in the current series, reinforcing the importance of early intervention [4, 17, 46, 51]. Within our study, each additional day of delay increased the failure odds by ~17%. Timely stent placement is a key modifiable factor for success.
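To make the per-day effect concrete, the sketch below compounds an odds ratio of 1.17 per day over several days and converts the resulting odds back to a probability. It assumes the effect is constant on the log-odds scale with all other predictors held fixed; the 20% baseline failure probability is a hypothetical value chosen purely for illustration.

```python
OR_PER_DAY = 1.17  # "~17% higher odds per additional day", as reported


def odds_multiplier(days: float, or_per_day: float = OR_PER_DAY) -> float:
    """Cumulative odds ratio after a given number of days of delay."""
    return or_per_day ** days


def shifted_probability(baseline_prob: float, days: float,
                        or_per_day: float = OR_PER_DAY) -> float:
    """Failure probability after applying the delay effect to baseline odds."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob)
    odds *= odds_multiplier(days, or_per_day)
    return odds / (1.0 + odds)


BASELINE = 0.20  # ASSUMPTION: hypothetical baseline failure probability

for d in (1, 3, 7):
    print(f"{d} day(s): odds x{odds_multiplier(d):.2f}, "
          f"probability {shifted_probability(BASELINE, d):.1%}")
```

Under these assumptions, a one-week delay roughly triples the odds of failure, since 1.17 raised to the seventh power is approximately 3.0.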
Previous studies have not consistently linked the type of surgery to stent outcomes. Our analysis found that RYGB was associated with a modestly lower risk of stent failure in both univariable and multivariable models, although the association was not statistically significant [50]. Interestingly, our model suggested that stent use for leaks occurring after revisional procedures might have a higher failure propensity, probably due to more complex anatomy or ensuing fibrosis. However, with limited literature, this needs corroboration.
We found that some stents (Niti-S18 and, to a lesser extent, Niti-S23 and Hanaro 21) were associated with failure. Some authors have advocated for larger “mega” stents to improve success [10], but our findings, similar to others, suggest that bigger is not necessarily better [29]. Stent choice in our study was not randomized, and optimal stent choice depends on the patient’s anatomy, leak characteristics, and availability, underscoring the need for further research on stent design and selection.
Regarding stent failure rates, among our combined development and validation cohorts (N = 400), placement was successful in 232 cases (58% primary closure) and failed in 168 cases (42%); 74% of the failures were because of stent migration (125 patients). A challenge when comparing post-MBS stent failure rates across studies is how success/failure is defined and reported and whether success/failure is calculated based on the first stenting or after multiple subsequent endoscopic maneuvers that ultimately result in success (after initial failure) [52]. We selected a stringent definition of failure (failure to resolve leak with the initial stent) to avoid conflating outcomes of additional interventions. Systematic reviews of stent management of post-MBS leaks reported failure and success rates, but it was unclear whether these were calculated based on first (initial) or further (subsequent) stenting [1]. Standardized definitions of stent success or failure as well as their preferred reporting are required for valid comparisons across studies, MBS techniques, time, and countries [52].
A notable observation of the current study is the importance of the patient’s general health status to the outcomes of post-MBS stenting, beyond the traditional leak-related factors and delayed diagnosis/treatment. Several predictors of failure were associated medical problems, including OSA, hypertension, diabetes, hyperlipidemia, and hepatomegaly. With no published prediction models, we are unable to compare our findings; however, low nutritional reserves impair wound healing, and these conditions are all components of the metabolic syndrome and obesity-related co-morbidity cluster. Mechanistic hypotheses for such associations are plausible: OSA causes chronic intermittent hypoxia and poor sleep, impairing wound healing; diabetes/associated hyperglycemia leads to poor circulation and immune dysfunction, slowing tissue repair; hypertension is accompanied by vascular changes that reduce tissue perfusion and is often part of a broader metabolic syndrome; hyperlipidemia contributes to a pro-inflammatory state; elevated triglycerides can undermine postoperative healing by promoting systemic inflammation; and hepatomegaly and advanced obesity/metabolic syndrome are potentially associated with chronic inflammation and coagulopathy that could hinder leak resolution [6, 53‐63].
Our model’s top predictors are complementary to, not in conflict with, established leak management principles. The wide variability in patients’ comorbid profiles could have given the model signal to capitalize on for prediction. Conversely, nearly all our patients received relatively prompt treatment (median ~20 days from diagnosis), providing the model with less variability in timing to capitalize upon when predicting. In summary, metabolic health mattered in addition to standard care, and host factors are important in leak outcomes in addition to established technical and local factors. In this sense, the model places a spotlight on the patient’s baseline condition, which can tilt the balance between success and failure of the same treatment, and suggests that more aggressive adjunctive therapy of associated medical conditions can support the healing of leaks. A more holistic view is required, with a focus not only on the leak but also on patient optimization, as systemic factors play a larger role than previously recognized. This emphasis on metabolic stabilization and medical optimization as part of comprehensive care represents a shift from primary attention on the leak itself, as the perfusion of surrounding tissues, general patient condition/nutrition, and infection affect the healing of post-MBS leaks [19].
The current study has limitations. The study is retrospective and non-randomized; patients were from two centers, stent choice was not randomized, and patient characteristics and postoperative management may not reflect practices at other centers using different stents or management protocols. Hence, the model’s performance and findings should be generalized to wider populations with caution. Hanaro 18 and 21 stents were excluded from the regression model as they did not have a minimum of 10 failures each. The inclusion of other variables that could influence stent outcomes would have been useful, such as nutritional parameters (e.g., albumin/prealbumin), sepsis at presentation, bougie size at index operation (tighter sleeve [small bougie] could predispose to leaks), presence of distal obstruction, and endoscopist’s decision-making or stent allocation strategy. We did not systematically document concurrent endoscopic interventions such as balloon dilation for strictures; however, the stents we used can manage leaks and strictures simultaneously, which may have influenced our success rates. The absence of a standardized, validated leak classification system during our study period limited our ability to apply established criteria, highlighting a need for practical and validated classification systems that correlate with clinical outcomes. In addition, nutritional management protocols varied between centers, with decisions individualized based on leak characteristics, patient stability, and institutional preferences reflecting real-world clinical practice. The specific breakdown of TPN versus enteral feeding approaches was not systematically documented in our retrospective analysis. Finally, our model is intended for use in patients selected for endoscopic stent management of leak; it does not address which leaks should be stented or initially managed surgically.
As all patients in our study received a stent, the score presumes an initial decision to stent has been made. It can help identify when a stent is likely to fail—but it does not replace clinical judgment in the initial choice of therapy. A point-based score, while convenient, is a simplification. There may be some loss of granularity, and it might not perfectly calibrate in every setting; thus, further prospective validation is needed. Future studies should validate the risk score in larger, multi-center cohorts across different regions to ensure broad applicability and incorporate the Alexandria-Bari-Stent tool into prospective decision-making protocols, possibly testing it in a clinical trial where high-risk patients are triaged to alternative strategies to demonstrate impact on patient outcomes. Prospective iterations should incorporate nutritional markers and operative technique details to further enhance the predictive modeling and document all concurrent interventions to better characterize the full scope of endoscopic management.
Despite these limitations, the study has many strengths. We analyzed the demographic, surgical, clinical, leak, and stent-related factors associated with stent failure; compared the performance of ML algorithms, selecting the most appropriate; evaluated the performance of the Lasso model on the external validation dataset; assessed the model’s permutation-based feature importance; converted the Lasso coefficients into a point-based scoring system for predicting stent failure; externally validated the risk score model on another 150 patients; calibrated the performance of the point-based risk score model in the external validation dataset; and appraised its clinical utility using DCA. We also generated a flowchart of the clinical outcomes of stent placement and subsequent management after failure across 400 post-MBS leaks. To our knowledge, this is the first study to undertake this comprehensive task. To expand on the clinical utility of the Alexandria-Bari-Stent tool, we are developing a multitask platform that would help surgeons in clinical decision-making to reduce stent failure risk.
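As an illustration of how regression coefficients can be turned into integer points, the sketch below scales each non-zero coefficient by the smallest absolute coefficient and rounds to the nearest integer. This is one common convention and is not necessarily the procedure used for the Alexandria-Bari-Stent score; the coefficient values and feature names are hypothetical.

```python
def coefficients_to_points(coefs: dict) -> dict:
    """Scale non-zero coefficients by the smallest one and round to integers."""
    # Lasso shrinks some coefficients exactly to zero; those carry no points.
    nonzero = {name: value for name, value in coefs.items() if value != 0.0}
    if not nonzero:
        raise ValueError("at least one non-zero coefficient is required")
    unit = min(abs(value) for value in nonzero.values())
    return {name: round(value / unit) for name, value in nonzero.items()}


def total_points(points: dict, patient: dict) -> int:
    """Sum the points for every binary feature present in the patient."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))


# ASSUMPTION: hypothetical log-odds coefficients, not the study's estimates.
HYPOTHETICAL_COEFS = {
    "osa": 0.90,
    "diabetes": 0.62,
    "revisional_surgery": 0.31,
    "rygb": -0.28,
    "unused_feature": 0.0,
}

points = coefficients_to_points(HYPOTHETICAL_COEFS)
example_patient = {"osa": True, "rygb": True}

print(points)
print("total:", total_points(points, example_patient))
```

Rounding is the source of the loss of granularity associated with point-based scores, which is why a score derived this way is typically recalibrated against observed outcomes.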