Amol Shetty: Thank you for having me.
Matthew Cooperberg: So, tell us a little bit about this project, how it got started and what we're hoping to learn about the black box.
Amol Shetty: I'm very interested in AI and trying to understand why people want to use it. And obviously the one thing I hear from everyone is, "Yes, we want to use it, but we don't understand what it's doing," and that is where our work started: what is the AI looking at? Coming back to this ArteraAI, it looks at digital pathology and tells you whether this person is going to have additional metastasis or not. Obviously that was all done in localized cancer, and our cohort was mostly oligometastatic cancer. And so, again, the AI is obviously looking at something about the biology in these images, but we don't understand it yet. And so, we were like, "Let's look at the biology." And so, we started by looking at spatial transcriptomics, which is a way to overlay both the digital pathology and the underlying biology for the same slide, and we mainly focused on the regions of high attention. The AI is focused on some regions in these pathology images more than others, and so we were looking at what is in these high-attention regions that is important to the AI.
Matthew Cooperberg: Talk a little more about the way the AI is working. So, the scanned-in digital pathology image that Artera gets, or any other pathology AI platform gets, we're going from dots to vectors to layers of complexity, right?
Amol Shetty: Yes.
Matthew Cooperberg: So, when you talk about attention, what are you getting as an output of attention from the deconvolutional layers?
Amol Shetty: Exactly. So, think of an image that is then, in computer language, converted into ones and zeros. And so, every time an image comes in, it's broken down into multiple patches, and each of those patches then, based on how the AI is trained, gives you a 128-feature vector. So, imagine it identifies the best features that tell you something about that image, and it does that across the entire length and breadth of the slide. That 128-feature vector, combined with clinical features, should then give you a final MMAI score. And so, the 128-feature vector itself is driven by the image, and that image is hopefully telling you about biology. And so, we looked at that 128-feature vector that is biology-driven and asked what the biology is that's driving it.
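The flow described in this answer (image to patches, patches to 128-number feature vectors, features plus clinical variables to one score) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the random linear "encoder", the softmax attention pooling, the logistic scoring head, and the clinical values are all hypothetical stand-ins, and none of it represents how the ArteraAI model is actually built or trained.

```python
import math
import random

FEATURE_DIM = 128  # feature-vector length mentioned in the interview


def split_into_patches(image, patch_size):
    """Cut a 2D grid of pixel values into non-overlapping square patches."""
    patches = []
    for r in range(0, len(image) - patch_size + 1, patch_size):
        for c in range(0, len(image[0]) - patch_size + 1, patch_size):
            patches.append([row[c:c + patch_size] for row in image[r:r + patch_size]])
    return patches


def embed_patch(patch, encoder):
    """Hypothetical encoder: a fixed linear projection of the flattened patch."""
    flat = [pixel for row in patch for pixel in row]
    return [sum(w * x for w, x in zip(w_row, flat)) for w_row in encoder]


def attention_pool(features, attn_weights):
    """Assumed pooling: softmax attention over patches, then a weighted average."""
    scores = [sum(a * f for a, f in zip(attn_weights, feat)) for feat in features]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    pooled = [
        sum(att * feat[i] for att, feat in zip(attention, features))
        for i in range(FEATURE_DIM)
    ]
    return pooled, attention


def combined_score(pooled, clinical, head_weights):
    """Assumed scoring head: logistic function over image and clinical features."""
    z = sum(w * x for w, x in zip(head_weights, pooled + clinical))
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
image = [[rng.random() for _ in range(32)] for _ in range(32)]  # fake 32x32 "slide"
patches = split_into_patches(image, 8)                          # 4 x 4 grid of patches
encoder = [[rng.gauss(0, 0.1) for _ in range(64)] for _ in range(FEATURE_DIM)]
features = [embed_patch(p, encoder) for p in patches]
attn_w = [rng.gauss(0, 0.1) for _ in range(FEATURE_DIM)]
pooled, attention = attention_pool(features, attn_w)
clinical = [0.5, 1.0, 0.0]                                      # made-up clinical inputs
head = [rng.gauss(0, 0.1) for _ in range(FEATURE_DIM + len(clinical))]
score = combined_score(pooled, clinical, head)
```

In this sketch, `attention` holds one weight per patch; a per-patch weight of that kind is what an attention heat-map would display, and the patches with the largest weights are the "high-attention regions" discussed in the interview.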
Matthew Cooperberg: And that's the question, right? And I found it fascinating that both with the visual AIs, and with the LLMs as well, ChatGPT and the like, we really have no idea what's going on in these intervening layers, in these 120 layers or 128 layers, and that's what you're hoping to understand, right?
Amol Shetty: Yes.
Matthew Cooperberg: So, what sort of data do you actually receive from the pathology AI analysis, and what did you do in the spatial transcriptomics to try to explain the biology?
Amol Shetty: So, from the images, we get what we call a 128-feature vector. It's a bunch of numbers that tells you, out of all the features that the AI is looking at, which are the most important, and what the score is for every individual. And we also get an MMAI score that is a composite of that entire slide. From the spatial transcriptomics, what we get is a genomic signal of what genes are expressed at every spot on that same slide. So, you can imagine a one-to-one correspondence: if I look at a spot in the H&E image, I can also identify which genes are expressed in the same spot on the spatial transcriptomics. And so, we overlaid those over each other using image registration, and that tells you exactly why a certain spot that's in high attention is important. And we asked two questions: is the cell composition different at that spot? Are we looking at different types of cells? And then if we are, what types of molecular signals or pathways are enriched?
And so, when we asked the first question, we were able to identify these malignant luminal epithelial cells that were very specific to each of these patients, and those were the regions that had high attention. So, the AI was for some reason looking at these specific spots on the image that had these malignant luminal epithelial cells. In some patients, it was epithelial cells; in other patients, it was the stromal regions around those epithelial cells, but it was of interest. And so, we focused on those regions specifically to identify why these regions, and what it is that makes them different.
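The first question above (is the cell composition different in high-attention spots?) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the spots are taken to be already registered into image pixel coordinates, each spot carries a single hypothetical `cell_type` label, and the patch size, attention values, threshold, and spot data are all invented for the example. It is not the study's actual pipeline.

```python
from collections import Counter

PATCH_SIZE = 100  # hypothetical patch edge length, in pixels


def patch_index(x, y, patch_size=PATCH_SIZE):
    """Return the (row, column) of the image patch containing pixel (x, y)."""
    return (int(y // patch_size), int(x // patch_size))


def label_spots(spots, attention_by_patch, threshold):
    """Flag each spot as high attention if its enclosing patch meets the threshold."""
    labelled = []
    for spot in spots:
        att = attention_by_patch.get(patch_index(spot["x"], spot["y"]), 0.0)
        labelled.append({**spot, "high_attention": att >= threshold})
    return labelled


def composition(spots):
    """Fraction of spots per cell type (empty dict if there are no spots)."""
    counts = Counter(s["cell_type"] for s in spots)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()} if total else {}


# Made-up attention per patch and made-up, already-registered spots.
attention_by_patch = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}
spots = [
    {"x": 10, "y": 20, "cell_type": "malignant luminal epithelial"},
    {"x": 50, "y": 60, "cell_type": "malignant luminal epithelial"},
    {"x": 150, "y": 30, "cell_type": "stromal"},
    {"x": 40, "y": 150, "cell_type": "benign epithelial"},
    {"x": 160, "y": 170, "cell_type": "stromal"},
    {"x": 120, "y": 90, "cell_type": "benign epithelial"},
]

labelled = label_spots(spots, attention_by_patch, threshold=0.5)
high_comp = composition([s for s in labelled if s["high_attention"]])
low_comp = composition([s for s in labelled if not s["high_attention"]])
```

Comparing `high_comp` with `low_comp` shows which cell types are over-represented where the model attends. The second question, which pathways are enriched, would then be asked of the gene expression in those same high-attention spots.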
Matthew Cooperberg: And this is on the Visium platform, right?
Amol Shetty: This is on the-
Matthew Cooperberg: So, you have thousands of genes to look at.
Amol Shetty: Yes. For each of the samples, we get information about nearly 5,000 spots across the length of the image, and we get about 4,000 to 5,000 genes per spot. And so, we're looking at the expression of those genes for every spot across all these images.
Matthew Cooperberg: And how big was your cohort? How many cases?
Amol Shetty: We had six patients from which we kept four based on the quality of the images, and we had six slides from each of these six patients. This was a small cohort for the spatial transcriptomics, but the larger cohort was then used for bulk transcriptomics and for whole-genome sequencing panel of mutations.
Matthew Cooperberg: So, at the end of the day, what are you learning then about what's happening in the black box? Do you think it's mostly cellular features? Do you think it's organization of cells, of structures, of glands? How do we get to understanding what is the computer really thinking?
Amol Shetty: Yeah. From our initial assessment, it seems like this AI was focused on these malignant luminal epithelial cells, and the molecular pathways that were driven in these cells were coming from this MYC enrichment. So, MYC signaling and mTOR signaling, which are well known in the field for cell proliferation. And that is something that we were happy to see because in this case of oligometastatic cancer, these cells are looking to proliferate and metastasize.
Matthew Cooperberg: Yeah.
Amol Shetty: When we looked at the stromal cells in the same high-attention regions, they were enriched for EMT signal. And again, that's another signal that is important for metastasis. And so, we were happy to see signals that made sense in this field. And so, that is where we get this confidence that the AI is looking at biology that is meaningful to the actual disease.
Matthew Cooperberg: And do you think it's the cellular morphology that is driving this, or is it the arrangement of the cells? There are layers of complexity, there are layers of geography we can consider here. So, at what level of morphology do you get the sense the computer is developing its attention? When you look at the attention heat-maps, is it at the individual cellular level or is it larger features?
Amol Shetty: I would say it's a little bit of both, because when we looked at the... We found different malignant luminal epithelial cells, but it was focused on one of those. So, it's looking at something about the cellular feature and the layers that are in there, but then at the same time, in different patients, it was looking at different things. And so, that means that it was, at the end of the day, focused on what is it that's driving the microenvironment around that tumor?
Matthew Cooperberg: Yeah. So, what's your next step with this? Where do you take this?
Amol Shetty: And so in some sense, if we can understand the biology that's underlying the AI, then hopefully, in one way, that builds confidence about what the AI is seeing, but at the same time, from a precision medicine standpoint, it helps us understand what it is that we can do for the patients. In this case, we want to know if we can use MDT to treat the patient versus something else. Or it might help in normal drug therapies and identifying what drugs might work, and what drugs might not.
Matthew Cooperberg: Do you think it also helps the AI field? So, when you step back, a company like Artera or the other companies developing these tools, they don't really have to care about the black box, it just has to work.
Amol Shetty: Yes.
Matthew Cooperberg: And as long as it works and they get some great papers and they're in the guidelines, we can order the test with confidence that we're going to have something helpful for the patient at the point of care, but obviously at a more fundamental level, yeah, we really want to know, what is the biology that the computer is reading? So, do you think there's a feedback loop where this type of information will actually help the AI developers come up with better AI algorithms by reflecting some of the biology that we're learning concurrent with the image?
Amol Shetty: Yes. If the AI is using the biology information, it's going to make it better, but there's a nuance to it. Obviously timing is of importance. And so, if the additional layer of biology increases the time in getting back these tests, then does that not defeat the purpose?
Matthew Cooperberg: Sure.
Amol Shetty: But at the same time, yes, we want to get more accurate results, we want to get more accurate predictions, and the biology might be able to help with that, right?
Matthew Cooperberg: Not even so much incorporating genomic results with the image to come up with a composite score, but does feeding the genomic information into a next iteration of the pathology AI tool itself help the computer do a better job focusing its attention in a way? So, right now the MMAI incorporates clinical information together with the image to come up with the score. So, if there's a way to feed some of the biologic interpretation information into the model as they develop version 3.7, or whatever it's going to be, do you think there's a scenario where that helps come up with a better visual algorithm if you're giving the computer a little bit more information about what it's seeing? Or does it have to remain entirely neutral and just image-based?
Amol Shetty: It's tough to say at this point because you'd have to do the testing to know if the score would get infinitely better by adding the biology to it. But at the same time, it might help to reduce the negatives. So, it might help you reduce something that you may have missed because you're focused on one part of the image versus another. I think that might be more helpful.
Matthew Cooperberg: Yeah. Terrific. Anything else you want to share?
Amol Shetty: From a research standpoint, I think being able to understand what the AI is looking at would give us more confidence in why we should use it, but at the same time, it'll also open up avenues to what therapies we can use. So, it's like taking your car to a mechanic, and the mechanic tells you there's something wrong with it, right? You can run it for the next 30 miles. But if you don't know how to fix it, then you only know that there's something wrong. I think that's where being able to understand what's underlying the AI will help us find more and more ways to fix the thing.
Matthew Cooperberg: Yeah. Very exciting time, obviously very, very fast-moving field. So, thanks for your contributions and thanks for joining us.
Amol Shetty: Thank you.