As of December 2025, the FDA has approved 1451 AI-enabled devices, 1104 in radiology.1 Only 3 of these, all in the area of diabetic retinopathy, are intended to work autonomously; all others are meant to augment rather than replace human clinician workflows. As in most domains, prostate cancer lags well behind breast cancer and behind colorectal and lung cancer as well.2 Clearly, however, we stand at an inflection point in prostate cancer, with a bevy of tools in varying stages of development and promulgation, mostly in the areas of pathology and radiology.
Emerging AI tools in these areas work on images (either high-resolution scans of pathology H&E slides or imaging DICOM files), using an algorithmic approach quite similar to the one that Google Images and similar tools use to distinguish a cat from a cupcake. In simple terms, the software builds up a layered “mental map” of a given image from pixels to vectors to shapes and higher-order features (circles, pupils, frosting, eyes, whiskers, sprinkles) and compares its conclusion against a human-validated gold standard, refining the model iteratively until accuracy is optimized. In pathology, the task is more complex, with models integrating information across thousands of image regions and learning from labels that are themselves imperfect. The AI considers over 100 interpretive layers representing different aspects of the image, and in most cases, even the AI architects do not fully understand what is actually contained in these layers. Clinical tools work quite similarly, but the goal is to separate not pets from baked goods but cancer from normal tissue, high-grade from low-grade, lethal from non-lethal, or some other human-defined outcome.
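To make the idea of iterative refinement against human labels concrete, here is a deliberately tiny sketch in Python. It is not how any clinical product is built: it swaps the deep, many-layered network for a single-layer logistic classifier, and the two input features (`whisker_score`, `frosting_score`), the learning rate, and all of the numbers are invented for illustration.

```python
import math


def predict(weights, bias, features):
    """Return the model's estimated probability that the image is a cat."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_classifier(examples, labels, epochs=500, learning_rate=0.5):
    """Fit a logistic classifier by repeatedly comparing guesses to labels.

    Each pass measures how far the predictions are from the human-assigned
    labels and nudges the weights to shrink that error (gradient descent).
    """
    n_features = len(examples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(examples, labels):
            error = predict(weights, bias, x) - y  # prediction minus label
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical, hand-made features: (whisker_score, frosting_score).
examples = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.0),   # cats
            (0.1, 0.9), (0.2, 0.8), (0.0, 0.7)]   # cupcakes
labels = [1, 1, 1, 0, 0, 0]                        # 1 = cat, 0 = cupcake

weights, bias = train_classifier(examples, labels)
print("P(cat) for a whiskery image:", predict(weights, bias, (0.85, 0.1)))
print("P(cat) for a frosted image:", predict(weights, bias, (0.1, 0.85)))
```

In a real image model the features are not supplied by hand; they are learned in the intermediate layers, which is exactly why their content is hard to interpret.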
In terms of pathology AI, the early leader, thanks in large part to unique access to multiple NRG trial tissue archives, is ArteraAI. The Artera test is termed a “multimodal artificial intelligence” (MMAI) tool, explicitly incorporating clinical variables in addition to pathology imaging into the iterative algorithm development process. As such, the incremental value added over standard clinical risk assessment is harder to quantify than for existing genomic or imaging tests, for example. Furthermore, standard pathology workflows are raising the baseline bar much higher with the new standardized Canary classification,3,4 which classifies patients in terms of risk more clearly and consistently than traditional Gleason scoring,5 without requiring additional software.
That said, the MMAI has established itself as a rare example of a test that clears the bar for being predictive rather than simply prognostic: the MMAI can explicitly guide decisions about the addition of androgen deprivation to radiation therapy. It may also be independently prognostic in other contexts, but so far, in this context, it has cleared only the low bar of improving over the outdated and inadequate NCCN risk classification. Other tests are emerging close on Artera’s heels: pathology AI systems from Pathomiq and AIRAMatrix, for example, do appear to improve risk stratification, for active surveillance in particular, over existing multimodal tests. In a particularly interesting trend, pathology AI companies and genomics companies are beginning to pair up, and it seems quite likely that combining both modalities will yield better accuracy than either one alone.
In the meantime, the PANDA (Prostate cANcer graDe Assessment) challenge has made over 10,000 scanned biopsies publicly available at grand-challenge.org, drawing over 1000 teams from 65 countries. The top submissions were combined and refined to yield algorithms with very high κ (agreement) scores of 0.87 with expert genitourinary pathologists for Gleason score assignment6 (higher, in fact, than expert pathologists with each other4). PANDA, publicly funded, is open-source and free.
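For readers who want to see what an agreement score of this kind measures, the sketch below computes a weighted Cohen's kappa between two raters in plain Python. The quadratic weighting is an assumption made here because it is a common choice for ordinal grades, and the grade lists are made up; they are not PANDA data.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, power=2):
    """Cohen's kappa with (|i - j| / (k - 1)) ** power disagreement weights.

    power=2 gives quadratic weighting, so a two-step disagreement is
    penalized four times as heavily as a one-step disagreement.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    chance_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += weight * observed[(i, j)] / n
            chance_disagreement += weight * (marg_a[i] / n) * (marg_b[j] / n)
    if chance_disagreement == 0:
        return 1.0  # both raters used the same single category throughout
    return 1.0 - observed_disagreement / chance_disagreement


# Hypothetical grade groups (1 to 5) assigned to the same ten biopsies.
algorithm = [1, 2, 2, 3, 3, 4, 5, 1, 2, 4]
pathologist = [1, 2, 3, 3, 3, 4, 4, 1, 2, 5]
print(weighted_kappa(algorithm, pathologist, categories=[1, 2, 3, 4, 5]))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why a score near 0.87 is considered very high for a task as subjective as grading.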
In radiology, the questions are generally more focused than in pathology, specifically aiming to address the consistently limited reliability of contemporary multiparametric MRI in predicting which patients are at risk of clinically significant prostate cancer, even given experienced eyes.7-9 The problem is not the imaging test itself; rather, it is how the test is interpreted under the PI-RADS 2.1 classification system. A patient undergoes a lengthy and perhaps anxiety-provoking multiparametric exam, which the scanner then Fourier transforms and simplifies into grayscale images. The human radiologist examines these and applies a mix of quantitative and qualitative criteria, summarized on a simplified 1 to 5 scale, which we then dichotomize for clinical decision making. The information loss from gigabytes of magnetic data to a one-bit binary outcome is massive and truly represents an ideal use case for AI.
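The final step of that pipeline, collapsing a 1 to 5 score into a yes/no call, can be sketched in a few lines of Python. The cutoffs of 3 and 4 and the six example patients are assumptions chosen purely for illustration; the point is only that the same scores yield different sensitivity and specificity depending on where the line is drawn.

```python
def dichotomize(pirads_scores, threshold):
    """Turn 1-5 scores into binary calls: True means 'act on this scan'."""
    return [score >= threshold for score in pirads_scores]


def sensitivity_specificity(called_positive, has_cancer):
    """Compare binary calls against biopsy truth (True = significant cancer)."""
    pairs = list(zip(called_positive, has_cancer))
    tp = sum(1 for call, truth in pairs if call and truth)
    fn = sum(1 for call, truth in pairs if not call and truth)
    tn = sum(1 for call, truth in pairs if not call and not truth)
    fp = sum(1 for call, truth in pairs if call and not truth)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores and biopsy outcomes for six patients.
scores = [1, 2, 3, 4, 5, 3]
truth = [False, False, False, True, True, True]

for cutoff in (3, 4):
    sens, spec = sensitivity_specificity(dichotomize(scores, cutoff), truth)
    print(f"cutoff >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Whichever cutoff is chosen, everything the scan recorded beyond that single comparison is discarded, which is the information loss described above.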
Enter the Prostate Imaging–Cancer Artificial Intelligence (PI-CAI) challenge. Using the same open-source grand-challenge.org platform as PANDA, the PI-CAI team uploaded over 10,000 biparametric MRIs (bpMRIs) and allowed any team to download samples of 1500 for training and 1000 for testing, together with histopathology outcomes. The response was remarkable: 839 individuals from 53 countries submitted 293 AI algorithms; the PI-CAI team merged and refined the top 5 (from 5 different countries on 4 continents) to generate a final model. The human competition, 62 experienced radiologists from 45 centers in 20 countries, was frankly outgunned even by this early version: their pooled AUC for predicting clinically significant disease (not for assigning a PI-RADS score) was a respectable 0.86, but PI-CAI (which is very much version 1.0) achieved 0.91, finding 9 more clinically significant cancers and 65 fewer false positives (including 8 grade group 1 lesions).10
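Because AUC is the yardstick for these comparisons, a short Python sketch of what it measures may help. It uses the rank-based definition (the chance that a randomly chosen patient with cancer receives a higher score than a randomly chosen patient without, with ties counted as half); the four scores below are invented and have nothing to do with the PI-CAI data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for clinically significant cancer, 0 otherwise.
    scores: the reader's or algorithm's suspicion score for each case.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0      # cancer case ranked above benign case
            elif p == q:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example: two benign cases followed by two cancers.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of 4 cancer/benign pairs are ordered correctly: 0.75
```

An AUC of 0.5 is a coin flip and 1.0 is perfect ranking, so the gap between 0.86 and 0.91 is larger than it may look at first glance.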
In a recent update examining whether PI-CAI could help radiologists assign PI-RADS scores (again based on bpMRI), unaided, experienced readers reached an AUC of 0.88. PI-CAI helped push their AUC to 0.92. However, PI-CAI alone without human input achieved an AUC of 0.95.11 So are prostate radiologists out of a job? Of course not. Not yet. Much more validation is needed, and there is no regulatory framework to even guide an unassisted algorithm. However, the horizon may be quite limited for radiologists, generalists in particular, to continue assigning PI-RADS scores without guidance from an AI algorithm, particularly an open-source one. To this end, the PI-RADS Steering Committee recently published a framework for prostate MRI AI development and reporting,12 which will be a key resource for guiding future development of these tools.
Meanwhile, commercial MRI-AI tools are entering the clinic with little public benchmarking. Quantib Prostate is FDA-cleared to automate segmentation, lesion detection, and structured reporting and integrates into PACS. It appears to improve a novice radiologist’s performance to rival that of an expert, but the supporting evidence is limited to a very small number of publications with modest cohorts.13 Lucida Medical’s Pi platform offers similar features and has been validated in NHS practice to perform on par with human experts. Siemens’ AI-Rad Companion for prostate MRI is embedded directly into syngo.via and includes tools for lesion detection and biopsy planning. Each of these tools may offer value, but comparative data are scarce, and integration into actual clinical workflows remains limited.
New MRI sequences will soon change the input data, adding another layer of complexity. Restriction Spectrum Imaging (RSI) may improve detection of aggressive disease, though it is not yet widely adopted. Hyperpolarized 13C magnetic resonance spectroscopic imaging (MRSI) is further afield—promising, but more costly and logistically complex. Any diagnostic AI tool built for today’s inputs may find itself outdated quickly, which underscores the need for nimble update pathways and transparent retraining standards.
PET/CT is likely next on deck. The aPROMISE platform, developed by Exini Diagnostics and commercialized as PYLARIFY AI by Lantheus, is FDA-cleared to automatically segment PSMA-avid lesions and quantify tumor burden from PET/CT. It has demonstrated solid performance in analyses linked to trials like CONDOR and is beginning to enter routine practice. Here too, the absence of public data and comparative benchmarking is notable; one study found aPROMISE could shave several minutes off the time a radiologist spent reading a scan, at the expense of sensitivity in specific areas such as liver metastases.14
Betting on open-source tools over commercial ones is always a tough proposition in terms of ultimate clinical application, but the openness, transparency, reproducibility, and global buy-in that characterize the PANDA and PI-CAI initiatives are highly appealing. If nothing else, they should keep fires burning brightly under the commercial purveyors to continue to innovate, validate, and prove true clinical and public value of their tests. Hopefully, the coming years will also see substantial effort put into explainability studies. Clinical models are optimized to endpoints essentially agnostic to what features drive the models, and what they are “seeing” is usually somewhat of a black box. By definition, though, whatever visual patterns appear must reflect some underlying biology, and elucidating that will be the focus of important future work.
The pace of progress in this area is so rapid that the future, even three years down the road, is nearly impossible to predict. More tests will definitely enter the market. Some will likely evolve to be formally incorporated into radiology and pathology workflows, but whether as first readers for triage, second readers for confirmation, or integrated assistants is not clear. AI clinical reads without human supervision are likely still a ways off for many reasons. Radiation oncology will face a similar wave,15 and eventually robots will also be used for surgery,16 though this will, of course, be further downstream. Urology needs to lead as the space develops, insisting on rigorous validation and continuously re-assessing the accuracy, clinical impact, and value of these tools. We should advocate for transparency, head-to-head comparisons, and, whenever possible, the use of open-source tools. As the clinicians ultimately responsible for our patients’ decisions and outcomes, we should assert a clear voice in shaping our AI-enabled future.
Written by: Matthew R. Cooperberg, MD, MPH, Professor of Urology, Epidemiology & Biostatistics, Helen Diller Family Chair in Urology, UCSF Department of Urology, University of California, San Francisco, California
1. US Food and Drug Administration. 2025. Artificial Intelligence-Enabled Medical Devices. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-enabled-medical-devices.
2. McNamara SL, Yi PH, Lotter W. The clinician-AI interface: intended use and explainability in FDA-cleared AI devices for medical image interpretation. npj Digit Med. 2024;7(1):80. doi:10.1038/s41746-024-01080-1
3. Nguyen JK, Harik LR, Klein EA, et al. Proposal for an optimised definition of adverse pathology (unfavourable histology) that predicts metastatic risk in prostatic adenocarcinoma independent of grade group and pathological stage. Histopathology. Published online 2024. doi:10.1111/his.15231
4. Ding CC, Xiao H, Nguyen JK, et al. Classification of prostatic adenocarcinoma as favourable/unfavourable histology has high interobserver agreement in prostate needle core biopsies. Histopathology. Published online 2025. doi:10.1111/his.15525
5. McKenney JK, Simko J, Bonham M, et al. The potential impact of reproducibility of Gleason grading in men with early stage prostate cancer managed by active surveillance: a multi-institutional study. J Urol. 2011;186(2):465-469. doi:10.1016/j.juro.2011.03.115
6. Bulten W, Kartasalo K, Chen PHC, et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nat Med. 2022;28(1):154-163. doi:10.1038/s41591-021-01620-2
7. Sonn GA, Fan RE, Ghanouni P, et al. Prostate Magnetic Resonance Imaging Interpretation Varies Substantially Across Radiologists. European Urology Focus. 2019;5(4):592-599. doi:10.1016/j.euf.2017.11.010
8. Greer MD, Brown AM, Shih JH, et al. Accuracy and agreement of PIRADSv2 for prostate cancer mpMRI: A multireader study. J Magn Reson Imaging. 2017;45(2):579-585. doi:10.1002/jmri.25372
9. Westphalen AC, McCulloch CE, Anaokar JM, et al. Variability of the Positive Predictive Value of PI-RADS for Prostate MRI across 26 Centers: Experience of the Society of Abdominal Radiology Prostate Cancer Disease-focused Panel. Radiology. 2020;296(1):76-84. doi:10.1148/radiol.2020190646
10. Saha A, Bosma JS, Twilt JJ, et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study. Lancet Oncol. 2024;25(7):879-887. doi:10.1016/s1470-2045(24)00220-1
11. Twilt JJ, Saha A, Bosma JS, et al. AI-Assisted vs Unassisted Identification of Prostate Cancer in Magnetic Resonance Images. JAMA Netw Open. 2025;8(6):e2515672. doi:10.1001/jamanetworkopen.2025.15672
12. Turkbey B, Huisman H, Fedorov A, et al. Requirements for AI Development and Reporting for MRI Prostate Cancer Detection in Biopsy-Naive Men: PI-RADS Steering Committee, Version 1.0. Radiology. 2025;315(1):e240140. doi:10.1148/radiol.240140
13. Faiella E, Vertulli D, Esperto F, et al. Quantib Prostate Compared to an Expert Radiologist for the Diagnosis of Prostate Cancer on mpMRI: A Single-Center Preliminary Study. Tomography. 2022;8(4):2010-2019. doi:10.3390/tomography8040168
14. Enei Y, Yanagisawa T, Okada A, et al. Comparison of diagnostic performance between manual diagnosis following PROMISE V2 and aPROMISE utilizing Ga/F-PSMA PET/CT. Ann Nucl Med. Published online 2025:1-9. doi:10.1007/s12149-025-02086-9
15. Preziosi F, Boschetti A, Catucci F, et al. AI-driven online adaptive radiotherapy in prostate cancer treatment: considerations on activity time and dosimetric benefits. Radiat Oncol. 2025;20(1):116. doi:10.1186/s13014-025-02697-6
16. Wah JNK. The rise of robotics and AI-assisted surgery in modern healthcare. J Robot Surg. 2025;19(1):311. doi:10.1007/s11701-025-02485-0