Name: AI Model PROGRxN-BCa Tested in Predicting Bladder Cancer Progression - Girish Kulkarni
Uploaded: 2026-02-11

Ashish Kamat: Hello everybody, and welcome to UroToday. I'm Ashish Kamat, urologic oncologist in Houston, Texas. And today we're going to be talking about a topic that has garnered a lot of attention, not just in medicine, but across the world and in everything we do, which is essentially artificial intelligence. But nowhere is it more crucial to truly understand how this can be used, not just for siloed benefits, but the benefit of a large group of patients. And this is where I'm joined today by Dr. Girish Kulkarni, who's led this really quite a remarkable international evaluation of an AI-based model for predicting risk progression and sub-stratification in bladder cancer, specifically in non-muscle-invasive bladder cancer. So Girish, thank you so much for taking your time. Really excited to hear your insights.

Girish Kulkarni: Thank you so much for this invitation, Dr. Kamat, and for the UroToday crew as well. Yeah, the project we led was the creation of a model called PROGRxN-BCa. And it was a project to help us modernize prediction of progression of non-muscle-invasive bladder cancer. And it's been accepted for publication in European Urology, in press right now. The reason we wanted to do this is because it's become quite clear that we need to have better tools for prognostication. On the left-hand side of the screen, you can see the previous tools we have been using, the EORTC 2006 model, CUETO in 2009, and a more updated EAU risk calculator 2021. The problem with the older models, EORTC and CUETO, was that they used the 1973 WHO grading scheme. Historically, that was fine, but the grading scheme has been updated for NMIBC and for bladder cancer in general, as we know, to 2004 and 2022 WHO grading. So those grading systems were out of date. Plus, in the EORTC, they excluded primary CIS patients. Patients did not receive BCG maintenance, and re-resection of T1 tumors was not done. And in CUETO, very similar shortfalls. So the EAU risk calculator was developed to try to address some of these issues. But again, patients were not treated with BCG and they excluded CIS patients as well, even though they used the more modern grading system. And at the very bottom of the table, you can see the C indices for these different calculators on external validation are not as ideal as we would like, 0.62 to 0.66. So we thought there was an opportunity here, especially with AI.

On the right-hand side are the different studies. We did a systematic review of tools used in prediction of progression of non-muscle-invasive bladder cancer and recurrence. And most of the studies published to date were low-quality. There's only been one real high-quality study out of Europe, so we wanted to attack this problem. So to do this, we led an international retrospective cohort study and we followed the STREAM-URO framework, which basically is a standardized approach to reporting and performing machine-learning applications. We've published that for everyone to follow in European Urology Focus, and we followed that framework. We assembled the largest cohort of NMIBC patients of its kind, almost 13,000 patients. 3,324 were used as a training cohort at four academic and community hospitals in the Toronto area. And then we validated throughout North America and Europe at 30 academic institutions, over 9,000 patients. The AI model we used is called a random survival forest, and we trained it on 14 readily available clinicopathological features. So these are features or variables that are widely available to everyone, stage, grade, concomitant CIS, number of tumors, size, for example. And there were 14 of those. We didn't use more modern biomarkers like ctDNA or pathomic-type data like slides and pixels from those types of slides because we wanted to make this really readily available for everyone right now. Our outcome was time to progression, defined as muscle-invasive disease or metastatic disease, and we compared this to the EAU risk calculator.

We evaluated using C indices, calibration, net benefit, and bias assessment, and we performed a number of subgroup analyses. I've listed some here. For example, we had WHO 1973 grade available for a number of our patients. So we did a subgroup with that grading system in addition to the 2004 and 2022, and we sub-stratified intermediate and high-risk groups. We also looked at patients who received BCG and who did not, and those who had concordant care, guideline-concordant care. And the take-home message at the top is the overall C index, which is the AUC or the ROC from the model. And in the training cohort, PROGRxN-BCa, that's the name of our model, was at 0.83, the EAU risk calculator at 0.76. And on external validation, in that 30 academic center validation set, it was 0.79 for PROGRxN-BCa and 0.71 for the EAU risk calculator. We also did some subgroup-specific calculations as well. This is to assess bias, to ensure that the model runs properly and gives similar results regardless of the subgroup. This is also called an algorithmic assessment. And you can see that everything demonstrated that PROGRxN-BCa was better than the EAU risk calculator, save for a few comparisons. For example, the EAU risk calculator was just as good in patients who did not receive BCG, which is what it was designed for or designed originally for anyways. Perhaps one of the more interesting things we found out also, it was another analysis, a subgroup on intermediate-risk NMIBC. Now, the IBCG has a number of risk factors and we've listed them at the bottom, multiple tumors, early recurrences within a year, frequent recurrences, more than one a year, large tumors, and previous failed intravesical therapy.

And based on the number of risk factors, that stratification or risk stratification approach with IBCG is quite good for predicting recurrence in the intermediate-risk NMIBC subset, and that's been validated. However, with progression, it hasn't panned out as well as with recurrence. And you can see on the left-hand side in our dataset, it didn't distinguish between zero and one to two risk factors quite well. Whereas our model, when we split it up into tertiles based on the progression risk score, had slightly better separation. So we think this one may be good for progression of intermediate-risk non-muscle-invasive bladder cancer, whereas the IBCG approach may be the ideal model for recurrence, for example, food for thought. So this model is readily available at PROGRXN.CA. It's a GitHub website application. You can put in all of your variables, the 14, and it'll give individual patient results. You could even upload your dataset to see how well the model predicts in your dataset as a form of validation. Overall, this is the largest NMIBC prognostic study of its kind. It's a hundred times larger than any other AI study. It outperforms the EAU risk calculator by about 10% using readily available data that everyone should have. It's agnostic to concordance to clinical practice guidelines, which is the reality out there. Not everyone is following the guidelines. So regardless of whether you're following them or not, the model should work for you. It improves sub-stratification, and I showed an example of intermediate risk, enhances risk-adaptive management.

For example, if a patient has a high risk of progression, you may offer them more aggressive radical therapy, and it may broaden eligibility for clinical trials. So instead of using guideline stratifications such as low, intermediate, or high risk, we could use risk of progression based on the model for entry into a clinical trial. This would not have been possible without the help of many, many others, all the centers that participated. Dr. Jethro Kwong was my master's student who did 99.9% of the work here, so I want to give him credit for that. And one of my colleagues, Dr. Alex Zlotta, was instrumental in helping bring everyone together and refining the model as well. So I want to thank them especially. Thank you very much.
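For readers unfamiliar with the C index quoted in the presentation (0.79 versus 0.71 on external validation), the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the study's code: the function name `concordance_index`, the toy data, and the choice to skip pairs with tied follow-up times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data (illustrative sketch only).

    times       -- follow-up time for each patient
    events      -- 1 if progression was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # 'a' is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier progressor had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
well_ranked = [0.9, 0.7, 0.4, 0.1]
poorly_ranked = [0.1, 0.7, 0.4, 0.9]
c_good = concordance_index(times, events, well_ranked)
c_poor = concordance_index(times, events, poorly_ranked)
```

A value of 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect ranking, which is the scale on which the reported figures for both calculators should be read.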

Ashish Kamat: Thanks so much, Girish. And thanks for obviously the invitation for us to participate in this effort. I think it's really great that we have so many different centers, global representation from patients, all the different walks of life. It really adds to the robustness of this effort. A couple of questions, and this is more, I guess, educational for our audience. Share with us a little bit as to why this effort is truly different from prior nomogram efforts or just sliding scale efforts. How exactly fundamentally does this help get over those hurdles?

Girish Kulkarni: I think it's the ability of AI to find interactions and relationships in models that our traditional standard linear-type models don't allow. So the random survival forest allows that. So interactions we may not see, and it helps account for non-proportional hazards as well. You'll see in the publication when it comes out, there's an editorial that talks about it. It's a very favorable editorial, but they discuss some of those benefits, and it's a really well-written editorial on the paper. So that's how AI helps, even though it's the exact same variables that we would put into a Cox model, for example, the machine-learning aspect allows other relationships that would not otherwise be identifiable to be identifiable.

Ashish Kamat: Exactly. I mean, it's the way AI works, I guess, even when we are putting in requests in LLMs or ChatGPT or Claude or whatever it is, because it's finding the relationship, the shortest distance between two data points. And here it's able to... I guess one way to explain this to folks is that it's able to remember the start of the sentence and the variable while it's reading the end of the clinical information and then make references in real time, which brings me to the next question. Is the thought process moving forward for this calculator, AI-based calculator, to constantly be self-learning, or at some point is this going to be boxed and said, "Okay, this is the training, this is what we've learned, this is the tool."?

Girish Kulkarni: It's a good question. Right now, we have not made it dynamic in that manner such that there is self-learning. That would require additional datasets, I think, to be input. So right now it is static, but it is something that we need to think about in terms of a cohort effect is what the old epidemiological term is, to make sure that it stays up to date. That certainly is something we would like to look at. And I think that will also include adding in pathologic-type data and maybe not so important in NMIBC, but if we ever expand to MIBC, radiomic-type data.

Ashish Kamat: And you preempted the question I was going to bring up anyways, but I'll bring that up. So obviously we, and I'm sure you have collaborated with a lot of folks that are doing AI or machine-learning-based algorithms and MRIs when it comes to predicting complete clinical response, ctDNA, of course, tissue with the Valar labs and the ability to prognosticate and in some ways predict, but more prognosticate patients. So how do you see this? I know it hasn't been tested or cross-platform comparisons have not been actually done at the backend, but how do you see something like this, which is based purely on clinical data, fitting in with molecular, radiomics, multiomics essentially? Do you think that they should in some ways be like Perplexity where one query then goes out to multiple platforms? Or do you think they would ideally still exist in relative silos that talk to each other, but independent of a common query?

Girish Kulkarni: Overall, I think the way the field is moving is to have multimodal AI. So this is a unimodal-type model. It's very adaptable for what the real world is now. And I think it should probably supplant the EAU risk calculator because we've demonstrated performance that exceeds that. Ultimately though, the future I think is to have the pathologic data, additional biomarkers as they become available. Whether it's going to be siloed or allowing data to be exchanged, I think is an intriguing question. Probably data sharing would have to be an issue that we have to address in terms of privacy and making sure that some of these variables are allowed to go to different servers. Ideally, I think if we share data and we can leverage the information other people put forth, it'll only make the model stronger. So I would really like that in the future. I don't think we're there yet, but that may be a longer-term project.

Ashish Kamat: And again, the fear always arises when we talk about general AI because people are clearly worried about the individual projects such as these being essentially cannibalized by the mothership of AI, but that's a whole different conspiracy, not conspiracy, but that's a whole different theory of AI taking over everything we do. Girish, this is such an exciting topic. We could talk forever, but obviously in the interest of time, we've got to call it a day. Thank you so much as always, and congratulations.

Girish Kulkarni: Thank you so much. My pleasure for joining and thanks for the offer and invitation.
