AiFURS was developed using 6,170 annotated ureteroscopic frames containing 11,870 labeled stones, making it one of the largest endoscopic datasets for kidney stone detection to date. The system incorporates YOLOv11-N, a modern deep-learning object detection model that identifies visual targets in each video frame, and BoT-SORT, an efficient multi-object tracking algorithm that maintains the identity of each stone throughout continuous surgical video. Together they enable reliable detection and tracking at nearly native frame rates (~20 fps). The system also converts pixels to millimeters using the laser fiber as a size reference, so stone size is calculated in real time with less reliance on subjective visual assessment. The complete AiFURS workflow was evaluated step by step through ex vivo testing, in vivo validation in 100 cases, external validation in 80 cases, and direct performance comparisons between urologists and the AI system.
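
The pixel-to-millimeter idea described above can be illustrated with a minimal sketch. It is not the AiFURS implementation: the function names, the 0.2 mm fiber diameter, and the choice of the longest bounding-box side as "stone size" are assumptions made for illustration, and the sketch assumes that the fiber tip and the stone lie at a similar distance from the lens.

```python
def mm_per_pixel(fiber_diameter_mm: float, fiber_width_px: float) -> float:
    """Scale factor derived from an object of known physical width.

    fiber_diameter_mm: known laser fiber diameter (hypothetical value below).
    fiber_width_px: apparent fiber width measured in the current frame.
    """
    if fiber_diameter_mm <= 0 or fiber_width_px <= 0:
        raise ValueError("fiber diameter and pixel width must be positive")
    return fiber_diameter_mm / fiber_width_px


def stone_size_mm(box_width_px: float, box_height_px: float, scale: float) -> float:
    """Estimate stone size as the longest bounding-box side, in millimeters.

    Using the longest side is an illustrative assumption, not the
    measurement rule reported for AiFURS.
    """
    return max(box_width_px, box_height_px) * scale


# Hypothetical frame: a 0.2 mm fiber appears 40 px wide,
# and a detected stone has a 500 x 300 px bounding box.
scale = mm_per_pixel(fiber_diameter_mm=0.2, fiber_width_px=40.0)
size = stone_size_mm(500.0, 300.0, scale)
print(f"scale: {scale:.4f} mm/px, estimated stone size: {size:.2f} mm")
```

Because the scale is recomputed from the fiber in each frame, the estimate adapts as the scope moves closer to or farther from the stone, provided the fiber stays in view.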

Figure 1. AiFURS development and validation pipeline, from ureteroscopic data collection to ex vivo, in vivo, and external performance evaluation.
Across ex vivo, in vivo, and external validations, AiFURS demonstrated high diagnostic accuracy for stone identification, classification, and size measurement. The system showed a strong correlation with caliper measurements for stones larger than 2 mm, a clinically significant threshold, and accurately predicted stone composition, outperforming visual assessments by 20 experienced urologists. These results highlight the limitations of purely visual interpretation during FURS and emphasize the potential of AI to enhance the surgeon’s intraoperative perception.
This study also demonstrates that intraoperative RF metrics detected by AiFURS can serve as prognostic markers for the need for secondary procedures. Logistic regression analysis revealed that the proportion of RFs greater than 2 mm, rather than the total count, was strongly associated with reoperation risk. This introduces a new, actionable intraoperative marker. By providing consistent, objective feedback on fragment size and burden, AiFURS has the potential to improve surgical precision, optimize laser lithotripsy strategies, and help surgeons determine an appropriate surgical endpoint in real time. The system highlights the broader potential of real-time AI in endourology: standardizing stone assessment, reducing variability among surgeons and centers, and ultimately reducing the risk of incomplete fragmentation and the need for postoperative intervention.
Although the results are promising, further work is needed. This initial study included single-component stones and excluded visually challenging cases, such as bleeding or severe inflammation. To improve generalizability, we have initiated AI-STONE-RCT, a multi-center prospective trial enrolling 500 patients to evaluate system performance across various surgical conditions.
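
To make the distinction between total RF count and the proportion of RFs greater than 2 mm concrete, the following minimal sketch computes both from a list of fragment sizes. The function name and the example sizes are hypothetical, and the sketch does not reproduce the study's logistic regression model or its coefficients.

```python
from typing import Dict, Sequence


def fragment_burden(sizes_mm: Sequence[float], threshold_mm: float = 2.0) -> Dict[str, float]:
    """Summarize residual fragments by count and by share above a size threshold.

    sizes_mm: one size estimate per tracked fragment, in millimeters.
    threshold_mm: the 2 mm cutoff discussed in the text (strictly greater than).
    """
    total = len(sizes_mm)
    large = sum(1 for s in sizes_mm if s > threshold_mm)
    # An empty case has no fragments, so the proportion is reported as 0.0.
    proportion = large / total if total else 0.0
    return {"total": total, "large": large, "proportion_large": proportion}


# Two hypothetical cases with the same total count but different burden.
case_a = fragment_burden([0.8, 1.5, 2.4, 3.1])   # half of the fragments exceed 2 mm
case_b = fragment_burden([0.6, 0.9, 1.2, 1.8])   # none exceed 2 mm
print(case_a)
print(case_b)
```

Both hypothetical cases contain four fragments, yet only the first has a nonzero share of fragments above the threshold, which is the kind of difference a raw count cannot capture.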
AiFURS represents a significant move toward AI-enabled precision surgery. By providing real-time, quantitative insights that go beyond subjective evaluation, it helps establish a more reliable and standardized framework for kidney stone treatment.
Written by: Chenfeng Wang (1, 2), Haomin Liang (3), Hairui Chen (1), Rashid Khan (3, 4), Donglai Shen (1), Haitao Liu (1), Dan Shen (1), Wei Wang (1), Jianwen Liu (1), Frédéric Panthier (5), Min Zhao (6), Xu Zhang (1), Bingding Huang (3), and Haixing Mai (1)
1. Department of Urology, The Third Medical Center, Chinese PLA General Hospital, Beijing, China
2. Department of Urology, The Seventy-third Group Army Hospital, Xiamen, China
3. College of Big Data and Internet, Shenzhen Technology University, Shenzhen, China
4. College of Engineering Physics, Shenzhen Technology University, Shenzhen, China
5. Sorbonne University GRC Urolithiasis No. 20, Tenon Hospital, Paris, France
6. College of Mechanical and Electrical Engineering, Guangdong University of Science and Technology, Dongguan, China