Machine Learning Models to Predict Postoperative Incontinence after Endoscopic Enucleation of the Prostate for Benign Prostatic Hyperplasia: An EAU-Endourology Study

Artificial intelligence (AI) has gained much interest in recent years for its wide-ranging applications in healthcare. Much effort has been invested into harnessing the predictive power of AI, with the end goal of providing personalized medicine. Essentially, it aims to address the following challenge: given what you know about a specific patient in front of you now, are you able to accurately predict if the patient will experience a particular outcome in the future?

In the field of urology, the outcomes in question are numerous. AI has been applied to stone composition prediction, complications after endourological procedures, oncological outcomes based on genomic and biomarker studies, planning of radiotherapy treatments, cytopathology, and surgical skill evaluation. However, while many research papers report excellent predictive results using their respective models, few provide an easily accessible and interactive interface for readers to test the model themselves. This is a crucial deficiency: AI has been around for many years, but it was the release of ChatGPT – an AI model that was accessible through a website and had a foolproof messenger-style interface – that catapulted it into a new age of public awareness.

Against this backdrop, we sought to harness AI on a large dataset of patients with benign prostatic hyperplasia (BPH) and bring it to readers in a seamless and accessible manner. Endoscopic enucleation of the prostate (EEP) has numerous benefits over transurethral resection of the prostate (TURP) for BPH, such as lower blood loss and superior improvement of lower urinary tract symptoms, but is conversely associated with a higher postoperative incontinence rate. The predictors of this outcome are not well understood, but accurately predicting the risk of incontinence preoperatively is crucial for personalized patient counselling, managing expectations, and optimizing postoperative management. Hence, this predictive question of whether a patient will have postoperative incontinence after EEP forms the basis of our analysis with AI – specifically, using machine learning (ML) algorithms.

ML is a subset of AI that uses statistical optimization to elucidate patterns within a given dataset. It has found particular success in outcome prediction using supervised learning, wherein its ability to detect subtle patterns not observable to simpler methods, such as logistic regression, lends itself to higher accuracy.

In this study, we harnessed two large BPH registries – the REAP and PEEL databases¹,² – as our data source. Both were international, multicenter registries of patients with BPH who underwent enucleation; REAP did not restrict prostate size, while PEEL was for patients with prostates ≥80 ml in volume. In total, 3828 patients with complete data were analyzed.

As is standard for ML problems, we split our data into training and validation datasets. The following characteristics were used as predictors of incontinence: age, prostate volume, preoperative International Prostate Symptom Score (IPSS), quality-of-life (QoL) score, maximum flow rate (Qmax), post-void residual urine (PVRU), presence of a preoperative indwelling catheter, early apical release (EAR), enucleation type (2-lobe, 3-lobe, or en-bloc), and laser energy type. In our dataset, median age was 68 years, median prostate volume was 85.5 ml, and 5.4% had a preoperative indwelling catheter. Median preoperative IPSS, QoL, Qmax, and PVRU were 23, 5, 8.4 ml/s, and 70 ml, respectively. The commonest enucleation type was 2-lobe, the commonest energy type was Thulium fiber laser, and EAR was performed in 34.0%.
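To make the splitting step concrete, here is a minimal sketch in plain Python. The article does not state the split ratio or whether the split was stratified by outcome, so the 80/20 ratio, the stratification, the function name `stratified_split`, and the synthetic labels are all assumptions for illustration only.

```python
import random


def stratified_split(labels, validation_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) so that each class keeps roughly
    the same share in the training and validation sets."""
    rng = random.Random(seed)

    # Group row indices by outcome class.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, val_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_val = round(len(idxs) * validation_fraction)
        val_idx.extend(idxs[:n_val])
        train_idx.extend(idxs[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Synthetic labels only (not study data): 1000 hypothetical patients,
# 15% of whom have the outcome (coded 1).
labels = [1] * 150 + [0] * 850
train_idx, val_idx = stratified_split(labels)
val_rate = sum(labels[i] for i in val_idx) / len(val_idx)
print(len(train_idx), len(val_idx), val_rate)
```

Stratifying matters most when the outcome is uncommon: a purely random split can leave the validation set with a noticeably different event rate from the training set.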

Six types of ML models were constructed using the training dataset and applied to the validation dataset to assess their accuracy. We found extreme gradient boosting (XGBoost) to be the most effective predictive model for incontinence. In brief, XGBoost builds a sequence of models that are individually weak; each new model is fitted to the errors of those before it, so that the weaknesses of one become the starting point for the next. The process repeats itself until the error (or “log loss” – a penalty of sorts which measures the discrepancy between the predicted and actual data) is minimized. For our model, the overall accuracy was 86.2%, sensitivity was 96.8%, specificity was 23.7%, positive predictive value was 88.2%, and negative predictive value was 55.9%. Essentially, these numbers mean that if a patient is predicted to be incontinent using this model, they are highly likely to be incontinent in real life. However, the converse may not be true: prediction of continence is inaccurate, so a patient who is predicted to be continent could in real life be continent or incontinent. Applied clinically, one could follow up more closely those who are predicted to be incontinent and manage this possible outcome expediently.
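As a reading aid for the five performance figures, the sketch below shows how they all derive from one 2x2 confusion matrix, and how accuracy links sensitivity and specificity through the share of patients in the class coded as "positive". The counts passed to `confusion_metrics` are hypothetical, not study data, and both function names are illustrative.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "ppv": tp / (tp + fp),           # actual positives among predicted positives
        "npv": tn / (tn + fn),           # actual negatives among predicted negatives
        "prevalence": (tp + fn) / total, # share of the positive class
    }


def implied_prevalence(accuracy, sensitivity, specificity):
    """Solve accuracy = sensitivity * p + specificity * (1 - p) for p."""
    return (accuracy - specificity) / (sensitivity - specificity)


# Hypothetical counts for illustration only.
m = confusion_metrics(tp=80, fp=10, tn=5, fn=5)
print(m)

# The same identity applied to the figures quoted in the text.
p = implied_prevalence(accuracy=0.862, sensitivity=0.968, specificity=0.237)
print(round(p, 3))
```

Applied to the figures reported above, the identity gives a positive-class share of roughly 0.85 in the validation data. The article does not state explicitly which outcome was coded as positive, so this is only a consistency check to keep in mind when reading sensitivity and specificity against the 15.0% incontinence rate.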

This exercise in ML puts the spotlight on some limitations of AI. In this database, the incidence of incontinence of any duration was 15.0%, and that of incontinence lasting longer than 3 months was only 1.6%. Specific complications were also much rarer than incontinence. Hence, only the outcome of incontinence of any duration could be predicted with reasonable accuracy. At cohort sizes similar to ours, ML model performance is much poorer when the outcome to be predicted is rare (class imbalance), as there are few “positive” cases from which the model can learn to differentiate them from the vast number of “negative” cases. Prediction of such rare outcomes requires a much larger dataset and is hence reserved for future studies that amass a number of patients that is at least 1 or 2 orders of magnitude larger. Even in such large cohorts, superior model performance is not guaranteed.³ Moreover, our model is only applicable to the ranges of variables available in our original database. The ranges in this study appear rather representative of those encountered in real-world practice, particularly with the large ranges of age (27-91 years) and prostate volume (15-539 ml), but caution is required when predicting outcomes at these extremes, since our dataset contains few patients there. There are also some limitations with the dataset itself. Postoperative incontinence of any duration was the only outcome predicted by this model, and it relied on patient self-reporting rather than objective measures such as bladder diaries or pad tests.
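The class-imbalance point can be illustrated with simple arithmetic. The sketch below assumes that the quoted rates (15.0% and 1.6%) apply to the 3828 analyzed patients and that 20% of them were held out for validation; the article states neither, so both are assumptions, and the function names are illustrative.

```python
def expected_positives(n_patients, event_rate):
    """Expected number of patients with the outcome."""
    return n_patients * event_rate


def always_negative_metrics(event_rate):
    """Metrics of a trivial model that never predicts the event."""
    return {"accuracy": 1 - event_rate, "sensitivity": 0.0}


N_PATIENTS = 3828          # analyzed cohort size from the text
VALIDATION_FRACTION = 0.2  # assumed; not stated in the article

for name, rate in [("any incontinence", 0.150),
                   ("incontinence > 3 months", 0.016)]:
    positives = expected_positives(N_PATIENTS, rate)
    in_validation = positives * VALIDATION_FRACTION
    trivial = always_negative_metrics(rate)
    print(f"{name}: ~{positives:.0f} events overall, "
          f"~{in_validation:.0f} in validation, "
          f"trivial-model accuracy {trivial['accuracy']:.1%}")
```

Under these assumptions, the rarer outcome leaves only a few dozen events to learn from, while a model that never predicts the event would still score above 98% accuracy. This is why accuracy alone is a poor yardstick for rare outcomes.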

Despite these limitations, this concept of enhanced predictive power is not to be underestimated. With the right dataset and the right question, ML can offer greater accuracy than conventional regression models for a wide range of outcomes – be it incontinence, complications, survival, or recurrence – and is a step in the right direction towards personalized medicine when used appropriately. With a personalized risk assessment for postoperative incontinence, clinicians can better inform patients about potential outcomes, manage expectations, and tailor preoperative discussions. This allows for more proactive patient education and shared decision-making. By stratifying patients based on their predicted risk of incontinence, healthcare resources (e.g., specialized incontinence clinics, pelvic floor physiotherapy referrals) can be more efficiently allocated to those who are most likely to benefit.

Finally, after reading all this, you may ask: What about the accessibility and interactive nature of AI that was promised in the earlier part of this article?

Our ML model for the prediction of postoperative incontinence after EEP for BPH can be accessed at: https://kyfong.shinyapps.io/glpmlshiny/. Remember the limitations and imperfections of AI, but do not hesitate to play around with it and see if it matches up to your experience in clinic!

Written by: Khi Yung Fong, MBBS, MRCSI, Resident, Department of Urology, Singapore General Hospital, Singapore

References:

1. Gauhar V, Gómez Sancha F, Enikeev D, Sofer M, Fong KY, Rodríguez Socarrás M, et al. Results from a global multicenter registry of 6193 patients to refine endoscopic anatomical enucleation of the prostate (REAP) by evaluating trends and outcomes and nuances of prostate enucleation in a real-world setting. World J Urol 2023;41:3033-40.
2. Gauhar V, Castellani D, Herrmann TRW, Gökce MI, Fong KY, Gadzhiev N, et al. Incidence of complications and urinary incontinence following endoscopic enucleation of the prostate in men with a prostate volume of 80 ml and above: results from a multicenter, real-world experience of 2512 patients. World J Urol 2024;42:180.
3. Herrin J, Abraham NS, Yao X, Noseworthy PA, Inselman J, Shah ND, Ngufor C. Comparative effectiveness of machine learning approaches for predicting gastrointestinal bleeds in patients receiving antithrombotic treatment. JAMA Netw Open 2021;4(5):e2110703.
