Name: Large-Scale Whole Genome Sequencing Study of Localized Prostate Cancer Tumors - Paul Boutros
Uploaded: 2025-08-19

Large-Scale Whole Genome Sequencing Study of Localized Prostate Cancer Tumors - Paul Boutros

August 19, 2025

Andrea Miyahira hosts Paul Boutros to discuss his team's Cancer Discovery publication on prostate cancer heterogeneity. Dr. Boutros describes the largest prostate cancer whole genome sequencing study ever conducted, analyzing localized disease across multiple international sites. The study identified 223 driver regions including novel findings like a chromosome 14 inversion occurring in nearly 50% of patients. They developed seven integrated molecular subtypes that map to different clinical presentations and grade groups. The research revealed that germline variants can predict specific somatic mutations occurring decades later. High-grade tumors showed dramatically more copy number gains and driver mutations than low-grade disease. Dr. Boutros emphasizes two key clinical implications: better identification of truly indolent cancers suitable for surveillance, and development of precision prevention strategies based on genetic predisposition patterns.

Biographies:

Paul Boutros, PhD, MBA, Interim Vice Dean for Research, Professor, Departments of Human Genetics and Urology, David Geffen School of Medicine, UCLA, Los Angeles, CA

Andrea K. Miyahira, PhD, Director of Global Research & Scientific Communications, The Prostate Cancer Foundation

Read the Full Video Transcript

Andrea Miyahira: Hi, I am Andrea Miyahira at the Prostate Cancer Foundation. With me today is Dr. Paul Boutros of UCLA to discuss his team's recent paper, The Germline and Somatic Origins of Prostate Cancer Heterogeneity, published in Cancer Discovery. Dr. Boutros, thanks for joining us.

Paul Boutros: Thank you for having me. It's great to be here. I'm so happy today to give you a quick overview of our work on understanding how the heterogeneity of prostate cancer emerges from both germline and somatic mutational events. And I'll start off by giving a brief pitch of how this study came to be. As you can see, it's a huge team study and it has nodes in Australia, South Africa, Canada, the US, and the UK.

And really the reason it has so many people is because it has so much data, and it really came to be the culmination of data collected in a number of places around the world to create the biggest prostate cancer whole genome sequencing (WGS) study ever. And why would we do that? Whole genome sequencing is expensive, maybe $5,000 per patient, a little bit less, compared to a typical cancer panel or exome, which is in the hundreds to maybe a thousand dollars.

But there are all sorts of things that you can't determine about the genome without sequencing the whole genome. 37% of all driver mutations in prostate cancer are invisible to targeted panels. A good example is FOXA1, a really important driver gene where almost two-thirds of mutations are missed. But you also can't learn things about the tumor's evolution, its mutational processes, and a lot of other features.

And so we put together the largest study of prostate cancer whole genome sequencing focused entirely on localized disease. And then we used that to come up with a compendium of the driver mutations that exist, reimagined subtyping of what the genetic trajectories of the disease are, the clinical hallmarks and how we relate molecular to clinical events, and the germline origins of all of that.

And I'll very quickly show a quick vignette of each of these. In the driver work, we used a number of new statistical and machine learning methods to identify 223 driver regions across the whole genome. They include things that you'd really expect like RB1 and SPOP and TP53, but they also include things nobody's thought of before. Right over here on chromosome 14, there's this small inversion that's very common in prostate cancer.

It occurs in almost 50% of patients, a little bit less. It's never been seen before, and yet it's associated with disease incidence and with disease aggression. And so we have this compendium of all the important mutations that we can then start to think about functionally interrogating. And part of that functional interrogation is asking which mutations work together.

These are called subtypes or molecular subtypes. TCGA did molecular subtypes, and there have been a number of versions since. And because we have the understanding of the whole genome, we have the most comprehensive of these, and we call them the integrated molecular subtypes (IMS). There are seven of them, IMS1 through 7. You can relate them to specific cancer driver genes and you can take a look at their patterns across the genome.

A great example would be IMS7, which is associated with high-level amplification of MYC. IMS1 just has very few mutations. And so we can have these subtypes that tell us about the different ways prostate cancer can evolve and its different trajectories of growth. And we can then link that all the way back to clinical presentation, what we really care about.

And since we're focused on localized disease, we can really take a look at a couple of key characteristics, grade being the most important, and we can see the differences between low-grade, grade group 1, and high-grade tumors. And we can say, for example, that genomic rearrangements are pretty much unchanged, as are indels, very similar across all grades.

By contrast, there are huge differences in the number of copy number gains from low-grade to high-grade disease, both subclonal in the branches of the tumor's evolution and clonal in its trunk. We can then take a look at the total number of driver mutations and see it approximately doubles from low-grade to high-grade disease, but with huge variability within each group.

You can identify specific mutations that are grade-associated. MYC has very low frequency in low-grade disease, but gets up to about a third of high-grade disease, or even BRCA2, which is very rare in low-grade disease, but appreciable at 5% of grade group 5 tumors. And I focused on grade because it's the most prominent prognostic feature in localized disease. But actually you can do this for age, PSA, T category, any feature you like, and you see differences. You can see, for example, that indels are really important in age-related trends. They distinguish early-onset from late-onset prostate cancer, but are not actually associated with any of the clinical prognostic features. And so we have this different pattern of somatic molecular events associated with different aspects of clinical presentation.

But in the end, you still have this question: where does all this start? What's the underlying origin of it? And that goes to the germline genome. And so what we did was ask, are patients that have a specific mutation more likely to have specific genetics? And at the bottom, each of these represents individual single bases in the human genome, 25 of them. And each of them is associated with specific mutations that a patient gets.

For example, CHD1 loss is about four times more frequent in individuals who have this germline SNP than those who do not. So to make that clear, we can actually make pretty good predictions about what's going to happen in a man's prostate cancer genome at birth, 60 years before that tumor ever emerges. And so we can put this all together into a picture.
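Illustration (not part of the interview): a statement such as "about four times more frequent" is a ratio of two frequencies, one among carriers of the germline SNP and one among non-carriers. The minimal sketch below shows that arithmetic. The function name and all counts are hypothetical and chosen only so the ratio comes out near four; they are not data from the study.

```python
def relative_frequency(carriers_with, carriers_total, others_with, others_total):
    """Ratio of a somatic event's frequency in SNP carriers vs. non-carriers."""
    carrier_rate = carriers_with / carriers_total
    other_rate = others_with / others_total
    return carrier_rate / other_rate


# Hypothetical counts, invented purely for illustration (not study data):
# 20 of 100 carriers vs. 45 of 900 non-carriers show the somatic event.
ratio = relative_frequency(20, 100, 45, 900)
print(f"{ratio:.1f}x more frequent in carriers")
```

A real analysis would also need a confidence interval and correction for testing many SNP and mutation pairs, which this sketch leaves out.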

And what we did was take whole genome sequencing of the five different grade groups of localized tumors and we can now map out their evolution. There are germline features at the bottom that are associated with the individual mutations and the subtypes of disease, the integrated molecular subtypes. And those subtypes themselves are more likely to give rise to some or other of the phenotypes.

So for example, grade group 1 tumors are preferentially subtypes 1, 2, and 3, whereas grade group 5 tumors are preferentially subtypes 5, 6, and 7. So we have this link between the germline genome, the somatic mutational genome, and what we ultimately care about, the clinical presentation, which leads to the aggression and the management for those individuals. Thank you. I'm really happy to discuss this work.

Andrea Miyahira: Thank you so much, Dr. Boutros, for sharing this really exciting study. So your study found that certain driver gene alterations such as ETS or NKX3-1 were associated with different patterns of genomic alterations. For instance, I think ETS was associated with elevated copy number loss and NKX3-1 was associated with all types of somatic mutations. So is it more likely that these driver gene alterations were upstream of such genomic patterns being causative, or more likely that they're a product of that type of genomic alteration?

Paul Boutros: Yeah, it's such a great question because this is the magic of whole genome sequencing. Unlike basically any other assay, except for single-cell sequencing, you can map out exactly when the event occurs and what things happen before and after it. And unlike most cancer types, in prostate cancer, almost all the driver events occur really early before a lot of these downstream mutations occur.

And that gives rise to the idea that effectively prostate cancers are born with a certain number of mutations and then they go through this long process of accumulating genomic instability before they escape the prostate and become metastatic. But that means that all of these associated molecular features that we look at are all consequences. They all come afterward. And some might be 100% caused by it, and some might be indirect through immune infiltration or hypoxia differences. But in the end, the somatic event appears to come first.

Andrea Miyahira: As a follow-up, were there germline alterations that associate with these?

Paul Boutros: Yeah, another really good question. We identified 25 germline events that were associated with about 15 of the somatic features. We found 223 driver regions, so 15 is 6 or 7% of them. But we're badly underpowered to do that type of analysis, so we will have lots of false negatives.

When we do a little bit of power analysis to figure out what saturation would look like, it looks like with larger cohort sizes we will eventually find that about 80% have an associated germline event and 20% do not. So not everything, but the majority, and we're going to need bigger studies to even see that.
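Illustration (not part of the interview): the percentages quoted in this answer can be checked with a few lines of arithmetic. The three input numbers come from the answer itself. Applying the projected 80% to the full set of 223 driver regions is an assumption made here for illustration, and the variable names are hypothetical.

```python
driver_regions = 223        # stated in the interview
with_germline_link = 15     # "about 15", stated in the interview
projected_fraction = 0.80   # projected saturation, stated in the interview

# Share of driver regions with a germline association found so far.
observed_fraction = with_germline_link / driver_regions

# Assumption: the projected 80% applies to all 223 driver regions.
projected_count = round(projected_fraction * driver_regions)

print(f"observed so far: {observed_fraction:.1%}")
print(f"projected at saturation: about {projected_count} of {driver_regions}")
```

The observed share works out to roughly 6.7%, consistent with the "6 or 7%" quoted, and the projection corresponds to roughly 178 of the 223 regions.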

Andrea Miyahira: Okay. Well, I look forward to the bigger studies. So what do co-occurring versus mutually exclusive driver gene alterations tell us about cancer development? And are those which are mutually exclusive likely to be synthetic lethal?

Paul Boutros: Yeah, really good question. So when we think about different mutations, there are kind of two ways to think about it. Sometimes they occur in the same cell, they're co-occurring, and sometimes they never occur in the same cell. And as you got at, the ones that don't occur in the same cell are almost certainly going to be in the same pathway. Therefore, it's unlikely that the cancer has a push to mutate both without some sort of treatment or management.

So it goes right down the path of synthetic lethality, at least for the activating mutations. But the other thing it really tells us about when we look at these beyond treatment is it gives us insight about the core origins of prostate cancer. And when two mutations happen, it's telling us that to get from a normal prostate cell to a cancer cell, just one of those isn't good enough.

It probably needs both. At least it needs both to become aggressive and clinically manifest. And when they're mutually exclusive, it actually is probably telling us that there are just different ways that a normal cell can become a prostate cancer cell. And so I often think about the exclusive ones as telling us about the subtypes and the inclusive ones as giving us insight into the processes that are necessary to become tumorigenic and eventually lethal.
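Illustration (not part of the interview): one common way to ask whether two driver alterations co-occur or avoid each other more than chance would predict is a one-sided Fisher-style test on a 2x2 table of per-tumor calls. The sketch below is a generic version of that idea using the hypergeometric distribution. It is not confirmed to be the method used in the paper, the function names are hypothetical, and the counts are invented for illustration. It works on tumor-level calls, so on its own it says nothing about whether two mutations share a cell.

```python
from math import comb


def hypergeom_pmf(k, n_total, n_a, n_b):
    """P(exactly k tumors carry both A and B), given the totals, under independence."""
    return comb(n_a, k) * comb(n_total - n_a, n_b - k) / comb(n_total, n_b)


def overlap_test(both, only_a, only_b, neither):
    """Return one-sided p-values (mutual exclusivity, co-occurrence)."""
    n_total = both + only_a + only_b + neither
    n_a = both + only_a
    n_b = both + only_b
    lo = max(0, n_a + n_b - n_total)   # smallest possible overlap
    hi = min(n_a, n_b)                 # largest possible overlap
    # Probability of an overlap this small or smaller.
    p_exclusive = sum(hypergeom_pmf(k, n_total, n_a, n_b) for k in range(lo, both + 1))
    # Probability of an overlap this large or larger.
    p_cooccur = sum(hypergeom_pmf(k, n_total, n_a, n_b) for k in range(both, hi + 1))
    return p_exclusive, p_cooccur


# Hypothetical cohort of 100 tumors (invented counts, not study data).
p_excl, p_co = overlap_test(both=0, only_a=30, only_b=30, neither=40)
print(f"never seen together: p_exclusive={p_excl:.2e}, p_cooccur={p_co:.2f}")

p_excl, p_co = overlap_test(both=25, only_a=5, only_b=5, neither=65)
print(f"usually seen together: p_exclusive={p_excl:.2f}, p_cooccur={p_co:.2e}")
```

A small `p_exclusive` corresponds to the "different routes to cancer" reading described above, and a small `p_cooccur` to the "one alteration alone isn't enough" reading. With 223 driver regions there are many pairs to test, so a real analysis would also correct for multiple testing.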

Andrea Miyahira: Okay, thank you. And was your study able to differentiate tumor heterogeneity from alterations that co-occur in the same tumor cell?

Paul Boutros: Yeah, good question. So broadly speaking, yes, you can distinguish whether or not two mutations happen in the same cell from whether two mutations are both present in the tumor, but on different branches. But there are always going to be some limitations with that just based on the sensitivity of assays. And the gold standard for doing that is single-cell DNA sequencing, and that's more sensitive.

But of the mutations that we see in prostate cancer, because the important ones happen early, we were able to do that definitively for almost all of the mutations that matter. So it's a nice way of thinking about it. A typical prostate cancer will have 2,000 or 3,000 mutations, of which 5 or 10 are drivers.

We can almost always figure it out for those 5 or 10 drivers. But of the remaining 2,000 or 3,000, we might only know about half. So it's this interesting mix where you don't always know, but you usually know for the important things.
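Illustration (not part of the interview): one simple reason early events are easier to place is a counting argument. If the fractions of cancer cells carrying two mutations add up to more than 100%, at least some cells must carry both. A mutation in the trunk is in every cancer cell, so it necessarily shares cells with any other mutation. The sketch below encodes only that rule. It is not confirmed to be the study's method, the function name is hypothetical, and real cancer cell fraction estimates are noisy.

```python
def must_share_cells(ccf_a, ccf_b):
    """True when two mutations are forced to overlap in at least some cells.

    ccf_a, ccf_b: estimated fraction of cancer cells carrying each mutation,
    between 0 and 1. If the two fractions sum to more than 1, they cannot
    sit entirely in separate cells.
    """
    return ccf_a + ccf_b > 1.0


# A trunk (clonal) driver paired with a subclonal mutation: overlap is forced.
print(must_share_cells(1.0, 0.4))

# Two small subclonal mutations: they could be on different branches.
print(must_share_cells(0.3, 0.4))
```

When the rule returns False the question stays open, which matches the point that for many of the remaining mutations you don't always know.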

Andrea Miyahira: Okay, great. And what are your biggest take-home messages for studying and understanding the genetic and genomic events that drive prostate cancer?

Paul Boutros: I think there are two. The first is a very clinical observation. These data suggest that the majority of cancers are born or destined to have a specific grade. And there may be sampling reasons why we don't detect that clinically, or it could simply be that the cancer is at the point where it's growing to get to that ultimate grade, but hasn't gotten there yet.

But there's this sense of a genetic destiny for each individual cancer because the grade-determining mutations happen early, and that has a big impact on our management of localized disease. It means we should be able to do an even better job of determining the origins of grade group 1, 2, and even 3 cancers that really probably could be on surveillance.

But on the back end, it means that for these grade group 4 and 5 cancers and, of course, some 3s and 2s, we should be able to do a much better job in the future with better datasets to really delineate which ones are going to need local therapy versus local plus systemic. And I think the data strongly suggest that that avenue of research is going to be even more fruitful than the existing assays that we have today.

So I think that's the single biggest clinical take-home. The other big take-home is we know that prostate cancer is a very inherited disease with strong family predisposition. We didn't know how much. It's not just inherited as yes/no, you get it, or yes/no, how likely is it to kill somebody, but also the specific mutations, the specific molecular phenotypes. And that means we can start thinking about not just how we would do precision medicine, but also precision prevention.

And we can start looking at dietary, exercise, lifestyle, surveillance and monitoring procedures that are more personalized and customized because of the strong genetic determinism. Long answer, but the two key things are it tells us a lot about the grade origins of prostate cancer and gives us increased focus on precision prevention and precision monitoring assays.

Andrea Miyahira: Okay, Paul, thank you so much for sharing this study with us today, and I look forward to your next study.

Paul Boutros: Thank you so much. It was great talking with you.
