Publications (68)

Xuan Cindy Li, Yuelin Liu, Farid Rashidi, S. Malikić, Stephen M. Mount, E. Ruppin, Kenneth D. Aldape, Cenk Sahinalp

The heritability of methylation patterns in tumor cells, as shown in recent studies, suggests that tumor heterogeneity and progression can be interpreted and predicted in the context of methylation changes. To elucidate methylation-based evolution trajectory in tumors, we introduce a novel computational method for methylation phylogeny reconstruction leveraging single cell bisulfite treated whole genome sequencing data (scBS-seq), incorporating additional copy number information inferred independently from matched single cell RNA sequencing (scRNA-seq) data, when available. We validate our method with the scBS-seq data of multi-regionally sampled colorectal cancer cells, and demonstrate that the cell lineages constructed by our method strongly correlate with original sampling regions. Our method consists of three components: (i) noise-minimizing site selection, (ii) likelihood-based sequencing error correction, and (iii) pairwise expected distance calculation for cells, all designed to mitigate the effect of noise and uncertainty due to data sparsity commonly observed in scBS-seq data. In (i), we present an integer linear program-based biclustering formulation to select a set of CpG-sites and cells so that the number of CpG-sites with non-zero coverage in the selected cells is maximized. This procedure filters out cells with read information in too few sites and CpG-sites with read information in too few cells. In (ii), we address the sequencing errors commonly encountered in currently available platforms with a maximum log likelihood approach to correct likely sequencing errors in scBS-seq reads, incorporating CpG-site copy number information in case it can be orthogonally obtained. Given the copy number and read information for a site in a cell, together with the overall sequencing error probability, we compute the log likelihood for all possible underlying allele statuses. 
If the mixed read statuses at the CpG-site for the cell are more likely due to sequencing error on homozygous alleles as opposed to the presence of alleles with mixed methylation statuses, we correct the reads of the minority methylation status to the majority one. In (iii), we introduce a formulation to estimate distances between any pair of cells. As scBS-seq data is typically characterized by shallow read coverage, there is rarely read count evidence for two (or more, depending on CNV status) alleles at a CpG-site. Since allele-specific methylation has been shown to have increased frequency in cancer tissues, given the reads at a CpG-site, it is especially important to consider the possibility of unobserved alleles and their methylation status when determining the CpG-site's possible methylation zygosities. Our method incorporates copy number information when available, and for each CpG-site in a cell, we compute a probability distribution across all possible methylation zygosities. Then, given specific distance values between pairs of distinct zygosities and the likelihood of each possible zygosity for each shared CpG-site in both cells, we compute the expected total distance between any pair of cells as the mean of expected distances across all shared CpG-sites. We leverage such pairwise distances in methylation phylogeny construction. Citation Format: Xuan C. Li, Yuelin Liu, Farid Rashidi, Salem Malikic, Stephen M. Mount, Eytan Ruppin, Kenneth Aldape, Cenk Sahinalp. Epigenomic tumor evolution modeling with single-cell methylation data profiling [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr LB020.
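The expected pairwise distance described in (iii) can be sketched in Python as follows. This is only an illustration under stated assumptions: a diploid site with three zygosity states, and a hypothetical distance table (the abstract does not give the actual distance values). The names `ZYGOSITY_DISTANCE`, `expected_site_distance` and `expected_cell_distance` are invented for this sketch and are not taken from the authors' software.

```python
from typing import Dict, Optional, Sequence

# ASSUMPTION: three zygosity states for a diploid CpG-site
# (0 = homozygous unmethylated, 1 = mixed, 2 = homozygous methylated)
# with hypothetical distance values between states.
ZYGOSITY_DISTANCE = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.5, 0.0],
]


def expected_site_distance(
    p_a: Sequence[float],
    p_b: Sequence[float],
    dist: Sequence[Sequence[float]] = ZYGOSITY_DISTANCE,
) -> float:
    """Expected distance at one CpG-site, given each cell's zygosity distribution."""
    return sum(
        p_a[i] * p_b[j] * dist[i][j]
        for i in range(len(p_a))
        for j in range(len(p_b))
    )


def expected_cell_distance(
    cell_a: Dict[str, Sequence[float]],
    cell_b: Dict[str, Sequence[float]],
    dist: Sequence[Sequence[float]] = ZYGOSITY_DISTANCE,
) -> Optional[float]:
    """Mean of expected site distances over CpG-sites covered in both cells.

    Returns None when the two cells share no covered site.
    """
    shared = cell_a.keys() & cell_b.keys()
    if not shared:
        return None
    total = sum(expected_site_distance(cell_a[s], cell_b[s], dist) for s in shared)
    return total / len(shared)
```

Each cell is represented here as a mapping from a CpG-site identifier to a probability vector over the three states, so sites missing from either cell are skipped rather than imputed.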

F. Mehrabadi, Kerrie L. Marie, Eva Pérez-Guijarro, S. Malikić, Erfan Sadeqi Azer, Howard H. Yang, Can Kızılkale, Charli Gruen et al.

Advances in single-cell RNA sequencing (scRNAseq) technologies uncovered an unexpected complexity in tumors, underlining the relevance of intratumor heterogeneity to cancer progression and therapeutic resistance. Heterogeneity in the mutational composition of cancer cells is a result of distinct (sub)clonal expansions, each with a distinct metastatic potential and resistance to specific treatments. Unfortunately, due to their low read coverage per cell, scRNAseq datasets are too sparse and noisy to be used for detecting expressed mutations in single cells. Additionally, the numbers of cells and mutations present in typical scRNAseq datasets are too large for available computational tools to, e.g., infer distinct subclones, lineages or trajectories in a tumor. Finally, there are no principled methods to assess distinct subclones inferred through single-cell sequencing data and the genomic alterations that seed and potentially cause them. Here we present Trisicell, a computational toolkit for scalable mutational intratumor heterogeneity inference and assessment from scRNAseq as well as single-cell genome or exome sequencing data. Trisicell allows reliable identification of distinct clonal lineages of a tumor, offering the ability to focus on the most important subclones and the genomic alterations that are associated with tumor proliferation. We comprehensively assessed Trisicell on a melanoma model by comparing distinct lineages and subclones it identifies on scRNAseq data, to those inferred using matching bulk whole exome (bWES) and transcriptome (bWTS) sequencing data from clonal sublines derived from single cells. Our results demonstrate that distinct lineages and subclones of a tumor can be reliably inferred and evaluated based on mutation calls from scRNAseq data through the use of Trisicell. Additionally, they reveal a strong correlation between aggressiveness and mutational composition, both across the inferred subclones, and among human melanomas.
We also applied Trisicell to infer and evaluate distinct subclonal expansion patterns of the same mouse melanoma model after treatment with immune checkpoint blockade (ICB). After integratively analyzing our cell-specific mutation calls with their expression profiles, we observed that each subclone with a distinct set of novel somatic mutations is strongly associated with a specific developmental status. Moreover, each subclone had developed a unique ICB-resistance mechanism. These results demonstrate that Trisicell can robustly utilize scRNAseq data to delineate intratumor heterogeneity and help understand biological mechanisms underlying tumor progression and resistance to therapy.

X. Li, Yuelin Liu, F. Mehrabadi, S. Malikić, Stephen M. Mount, E. Ruppin, K. Aldape, S. C. Sahinalp

Recent studies on the heritability of methylation patterns in tumor cells suggest that tumor heterogeneity and progression can be studied through methylation changes. To elucidate methylation-based evolution trajectories in tumors, we introduce a novel computational framework for methylation phylogeny reconstruction, leveraging single cell bisulfite treated whole genome sequencing data (scBS-seq), additionally incorporating copy number information inferred independently from matched single cell RNA sequencing (scRNA-seq) data, when available. Our framework consists of three components: (i) noise-minimizing site selection, (ii) likelihood-based sequencing error correction, and (iii) pairwise expected distance calculation for cells, all designed to mitigate the effect of noise and uncertainty due to data sparsity commonly observed in scBS-seq data. We validate our approach with the scBS-seq data of multi-regionally sampled colorectal cancer cells, and demonstrate that the cell lineages constructed by our method strongly correlate with original sampling regions. Additionally, we show that the constructed phylogeny can be used to impute missing entries, which, in turn, may help reduce sparsity issues in scBS-seq data sets. Contact: cenk.sahinalp@nih.gov
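Component (ii), likelihood-based sequencing error correction, can be illustrated with a small Python sketch. The model below is an assumption made for illustration, not the authors' published formulation: each read independently reports the methylation status of a uniformly chosen allele and is flipped with probability `error_rate`, and the candidate allele statuses are the possible numbers of methylated copies. The function names and the default error rate are hypothetical.

```python
import math
from typing import List, Tuple


def allele_log_likelihoods(
    meth_reads: int,
    unmeth_reads: int,
    copy_number: int = 2,
    error_rate: float = 0.01,
) -> List[float]:
    """Log likelihood of the reads for k = 0..copy_number methylated alleles.

    ASSUMPTION: simple binomial read model; the constant binomial
    coefficient is omitted because it does not affect the comparison.
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must lie strictly between 0 and 0.5")
    log_liks = []
    for k in range(copy_number + 1):
        frac = k / copy_number
        # Probability that a single read is observed as methylated.
        q = frac * (1.0 - error_rate) + (1.0 - frac) * error_rate
        log_liks.append(meth_reads * math.log(q) + unmeth_reads * math.log(1.0 - q))
    return log_liks


def correct_reads(
    meth_reads: int,
    unmeth_reads: int,
    copy_number: int = 2,
    error_rate: float = 0.01,
) -> Tuple[int, int]:
    """Reassign minority reads when a homozygous status is the most likely one."""
    log_liks = allele_log_likelihoods(meth_reads, unmeth_reads, copy_number, error_rate)
    best_k = max(range(len(log_liks)), key=log_liks.__getitem__)
    total = meth_reads + unmeth_reads
    if best_k == 0:
        return (0, total)      # all alleles unmethylated
    if best_k == copy_number:
        return (total, 0)      # all alleles methylated
    return (meth_reads, unmeth_reads)  # mixed status: leave reads unchanged
```

Under these assumptions, a site with nine methylated reads and one unmethylated read is treated as homozygous methylated, whereas an even split is left as observed.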

Matthew H. Bailey, W. Meyerson, L. Dursi, Liang-Bo Wang, Guanlan Dong, Wen-Wei Liang, A. Weerasinghe, Shantao Li et al.

Correction to this paper has been published: https://doi.org/10.1038/s41467-020-20128-w

Matthew H. Bailey, W. Meyerson, L. Dursi, Liang-Bo Wang, Guanlan Dong, Wen-Wei Liang, A. Weerasinghe, Shantao Li et al.

The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and un-observed mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts. With the generation of large pan-cancer whole-exome and whole-genome sequencing projects, a question remains about how comparable these datasets are. Here, using The Cancer Genome Atlas samples analysed as part of the Pan-Cancer Analysis of Whole Genomes project, the authors explore the concordance of mutations called by whole exome sequencing and whole genome sequencing techniques.

Constance H. Li, S. Prokopec, Ren X. Sun, Fouad Yousif, Nathaniel Schmitz, F. Al-Shahrour, Gurnit Atwal et al.

Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here we report a pan-cancer analysis of sex differences in whole genomes of 1983 tumours of 28 subtypes as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We both confirm the results of exome studies, and also uncover previously undescribed sex differences. These include sex-biases in coding and non-coding cancer drivers, mutation prevalence and strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in molecular cancer research. There’s an emerging body of evidence to show how biological sex impacts cancer incidence, treatment and underlying biology. Here, using a large pan-cancer dataset, the authors further highlight how sex differences shape the cancer genome.

S. Malikić, F. Mehrabadi, Erfan Sadeqi Azer, Mohammad Haghir Ebrahim-Abadi, S. C. Sahinalp

Single-cell sequencing (SCS) data have great potential in reconstructing the evolutionary history of tumors. Rapid advances in SCS technology in the past decade were followed by the design of various computational methods for inferring trees of tumor evolution. Some of the earliest methods were based on a direct search in the space of trees with the goal of finding the maximum likelihood tree. However, it can be shown that instead of searching directly in the tree space, we can perform a search in the space of binary matrices and obtain the maximum likelihood tree directly from the maximum likelihood matrix. The potential of the latter tree search strategy has recently been recognized by different research groups and several related methods were published in the past 2 years. Here we provide a review of the theoretical background of these methods and a detailed discussion, which are largely missing in the available publications, of the correlation between the two tree search strategies. We also discuss each of the existing methods based on the search in the space of binary matrices and summarize the best-known single-cell DNA sequencing data sets, which can be used in the future for assessing performance on real data of newly developed methods.
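The link between binary matrices and trees rests on a standard result: a binary cell-by-mutation matrix corresponds to a perfect phylogeny exactly when, for every pair of mutations, the sets of cells carrying them are either disjoint or nested. The Python sketch below checks that condition. It is offered as an illustration of the general idea, not as the procedure of any specific method reviewed above, and the function name `is_conflict_free` is chosen here for convenience.

```python
from itertools import combinations
from typing import List


def is_conflict_free(matrix: List[List[int]]) -> bool:
    """Return True if the binary matrix admits a perfect phylogeny.

    Rows are cells, columns are mutations, and 1 marks a mutation present
    in a cell. Two columns conflict when their cell sets overlap without
    one containing the other.
    """
    if not matrix:
        return True
    n_cols = len(matrix[0])
    cell_sets = [
        frozenset(r for r, row in enumerate(matrix) if row[c] == 1)
        for c in range(n_cols)
    ]
    for a, b in combinations(cell_sets, 2):
        if a & b and not (a <= b or b <= a):
            return False
    return True
```

A matrix that passes this check can be turned into a tree by nesting mutations according to containment of their cell sets, which is why a search over such matrices can stand in for a search over trees.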

S. Dentro, I. Leshchiner, Kerstin Haase, M. Tarabichi, J. Wintersinger, A. Deshwar, Kaixian Yu, Yulia Rubanova et al.

Ermin Hodzic, Raunak Shrestha, S. Malikić, C. Collins, K. Litchfield, S. Turajlic, S. C. Sahinalp

Motivation: As multi-region, time-series, and single cell sequencing data become more widely available, it is becoming clear that certain tumors share evolutionary characteristics with others. In the last few years, several computational methods have been developed with the goal of inferring the subclonal composition and evolutionary history of tumors from tumor biopsy sequencing data. However, the phylogenetic trees that they report differ significantly between tumors (even those with similar characteristics). Results: In this paper, we present a novel combinatorial optimization method, CONETT, for detection of recurrent tumor evolution trajectories. Our method constructs a consensus tree of conserved evolutionary trajectories based on the information about temporal order of alteration events in a set of tumors. We apply our method to previously published datasets of 100 clear-cell renal cell carcinoma and 99 non-small-cell lung cancer patients and identify both conserved trajectories that were reported in the original studies, as well as new trajectories. Availability: CONETT is implemented in C++ and available at https://github.com/ehodzic/CONETT.

Erfan Sadeqi Azer, Mohammad Haghir Ebrahimabadi, S. Malikić, R. Khardon, S. C. Sahinalp

Erfan Sadeqi Azer, Farid Rashidi Mehrabadi, S. Malikić, Xuan Cindy Li, Osnat Bartok, Kevin Litchfield, Ronen Levy, Yardena Samuels et al.

Motivation: Recent advances in single cell sequencing (SCS) offer an unprecedented insight into tumor emergence and evolution. Principled approaches to tumor phylogeny reconstruction via SCS data are typically based on general computational methods for solving an integer linear program (ILP), or a constraint satisfaction program (CSP), which, although guaranteeing convergence to the most likely solution, are very slow. Others based on Markov chain Monte Carlo (MCMC) or alternative heuristics not only offer no such guarantee, but also are not faster in practice. As a result, novel methods that can scale up to handle the size and noise characteristics of emerging SCS data are highly desirable to fully utilize this technology. Results: We introduce PhISCS-BnB, a Branch and Bound algorithm to compute the most likely perfect phylogeny (PP) on an input genotype matrix extracted from a SCS data set. PhISCS-BnB not only offers an optimality guarantee, but is also 10 to 100 times faster than the best available methods on simulated tumor SCS data. We also applied PhISCS-BnB on a large melanoma data set derived from the sub-lineages of a cell line involving 24 clones with 3574 mutations, which returned the optimal tumor phylogeny in less than 2 hours. The resulting phylogeny also agrees with bulk exome sequencing data obtained from in vivo tumors growing out from the same cell line. Availability: https://github.com/algo-cancer/PhISCS-BnB

Marek Cmero, Ke Yuan, Cheng Soon Ong, J. Schröder, D. Adams, Pavana Anur, R. Beroukhim et al.

We present SVclone, a computational method for inferring the cancer cell fraction of structural variant (SV) breakpoints from whole-genome sequencing data. SVclone accurately determines the variant allele frequencies of both SV breakends, then simultaneously estimates the cancer cell fraction and SV copy number. We assess performance using in silico mixtures of real samples, at known proportions, created from two clonal metastases from the same patient. We find that SVclone’s performance is comparable to single-nucleotide variant-based methods, despite having an order of magnitude fewer data points. As part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we use SVclone to reveal a subset of liver, ovarian and pancreatic cancers with subclonally enriched copy-number neutral rearrangements that show decreased overall survival. SVclone enables improved characterisation of SV intra-tumour heterogeneity. The authors present SVclone, a computational method for inferring the cancer cell fraction of structural variants from whole-genome sequencing data.
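For readers unfamiliar with the relationship between variant allele frequency and cancer cell fraction, the commonly used point estimate for a single variant can be sketched in Python as below. This is a textbook simplification and not SVclone's model, which estimates cancer cell fraction and SV copy number jointly. The function name and parameters (`purity`, `tumour_cn`, `multiplicity`, `normal_cn`) are illustrative assumptions.

```python
def cancer_cell_fraction(
    vaf: float,
    purity: float,
    tumour_cn: int = 2,
    multiplicity: int = 1,
    normal_cn: int = 2,
) -> float:
    """Point estimate of the cancer cell fraction for one variant.

    ASSUMPTION: VAF = purity * multiplicity * CCF / average copy number,
    where the average copy number mixes tumour and normal cells by purity.
    The estimate is not clipped, so noisy inputs can give values above 1.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must lie in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    average_cn = purity * tumour_cn + (1.0 - purity) * normal_cn
    return vaf * average_cn / (purity * multiplicity)
```

For example, a variant at 25% allele frequency in a pure, copy-neutral diploid sample corresponds to a cancer cell fraction of 0.5 under this formula.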

Yulia Rubanova, Ruian Shi, Caitlin F. Harrigan, Roujia Li, J. Wintersinger, Nil Sahin, A. Deshwar, Stefan C. Dentro, Ignaty Leshchiner, Moritz Gerstung, Clemency Jolly, Kerstin Haase, Maxime Tarabichi et al.

The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution due to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3–5% activity reconstruction error and a 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes. Cancers evolve as they progress under differing selective pressures. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, the authors present the method TrackSig, which estimates evolutionary trajectories of somatic mutational processes from single bulk tumour data.
