Artificial Intelligence (AI) is one of the most promising technologies of the 21st century, with an already noticeable impact on society and the economy. In this work, we provide a short overview of global trends, applications in industry, and selected use cases from our international experience in industry and academia. The goal is to present global and regional good practices and to provide an informed opinion on realistic goals and opportunities for positioning B&H on the global AI scene.
Emerging sets of single-cell sequencing data make it appealing to apply existing tumor phylogeny reconstruction methods to analyze the associated intratumor heterogeneity. Unfortunately, tumor phylogeny inference is an NP-hard problem, and existing principled methods typically fail to scale to the thousands of cells and mutations observed in emerging single-cell data sets. Even though there are greedy heuristics to build hierarchical clusterings of cells and mutations, they suffer from well-documented issues in accuracy. Additionally, even when “optimal” solutions are feasible, existing approaches only provide a single “most likely” tree to depict the evolutionary processes that may result in an observed collection of cells and mutations. To make matters worse, the vast majority of single-cell sequencing data sets are transcriptomic and, as a result, suffer from considerable variation in coverage across mutational loci. In this paper, we introduce Trisicell, a computational toolkit for scalable tumor phylogeny reconstruction and validation from single-cell genomic, exomic or transcriptomic sequencing data. Trisicell has three components: (i) Trisicell-DnC, a new tumor phylogeny reconstruction method from genotype matrices derived from single-cell data, (ii) Trisicell-ConT, a new algorithm for constructing the consensus of two or more tumor phylogenies, which may be built through the use of different data types on the same set of cells or through the use of different methods on the same data, and (iii) Trisicell-PF, a new partition function method for assessing the likelihood of any user-defined subtree/set of cells to be seeded by a given set of mutations in the phylogeny. Collectively, these tools provide a means of identifying and validating robust portions of a tumor phylogeny, offering the ability to focus on the most important (sub)clones and the genomic alterations that seed the associated clonal expansion.
We applied Trisicell to a panel of clonal sublines derived from single cells of a parental mouse melanoma model, on which we performed both whole exome and whole transcriptome sequencing. The tumor phylogenies of the clonal sublines built on exomic and transcriptomic mutations by Trisicell-DnC were shown by Trisicell-ConT to be highly similar, and the subtrees composed of phenotypically similar clonal sublines were shown by Trisicell-PF to be strongly associated with their seeding mutations. In addition, we applied Trisicell to single-cell whole transcriptome sequencing data from a tumor derived from the same parental melanoma cell line, which was subjected to anti-CTLA-4 immunotherapy. The phylogenies generated from both studies featured distinct subtrees, strongly associated with phenotypes including cell differentiation status, tumor growth and therapeutic response. These results suggest that Trisicell can be used for scalable tumor phylogeny reconstruction and validation through both single-cell and clonal-subline sequencing data, which may reveal strong phenotypic associations. In particular, they suggest that the developmental status and phenotypic intratumoral heterogeneity of melanoma originate from observable subclonal variation. Citation Format: Farid Rashidi Mehrabadi, Salem Malikic, Kerrie L. Marie, Eva Perez-Guijarro, Erfan Sadeqi Azer, Howard H. Yang, Can Kizilkale, Charli Gruen, Huaitian Liu, Christina Marcelus, Aydin Buluc, Funda Ergun, Maxwell P. Lee, Glenn Merlino, Chi-Ping Day, S. Cenk Sahinalp. Trisicell: Scalable Tumor Phylogeny Reconstruction and Validation Reveals Developmental Origin and Therapeutic Impact of Intratumoral Heterogeneity [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr LB019.
The heritability of methylation patterns in tumor cells, as shown in recent studies, suggests that tumor heterogeneity and progression can be interpreted and predicted in the context of methylation changes. To elucidate methylation-based evolution trajectory in tumors, we introduce a novel computational method for methylation phylogeny reconstruction leveraging single cell bisulfite treated whole genome sequencing data (scBS-seq), incorporating additional copy number information inferred independently from matched single cell RNA sequencing (scRNA-seq) data, when available. We validate our method with the scBS-seq data of multi-regionally sampled colorectal cancer cells, and demonstrate that the cell lineages constructed by our method strongly correlate with original sampling regions. Our method consists of three components: (i) noise-minimizing site selection, (ii) likelihood-based sequencing error correction, and (iii) pairwise expected distance calculation for cells, all designed to mitigate the effect of noise and uncertainty due to data sparsity commonly observed in scBS-seq data. In (i), we present an integer linear program-based biclustering formulation to select a set of CpG-sites and cells so that the number of CpG-sites with non-zero coverage in the selected cells is maximized. This procedure filters out cells with read information in too few sites and CpG-sites with read information in too few cells. In (ii), we address the sequencing errors commonly encountered in currently available platforms with a maximum log likelihood approach to correct likely sequencing errors in scBS-seq reads, incorporating CpG-site copy number information in case it can be orthogonally obtained. Given the copy number and read information for a site in a cell, together with the overall sequencing error probability, we compute the log likelihood for all possible underlying allele statuses. 
If the mixed read statuses at the CpG-site for the cell are more likely due to sequencing error on homozygous alleles than to the presence of alleles with mixed methylation statuses, we correct the reads of the minority methylation status to the majority one. In (iii), we introduce a formulation to estimate distances between any pair of cells. As scBS-seq data is typically characterized by shallow read coverage, there is rarely read count evidence for two (or more, depending on CNV status) alleles at a CpG-site. Since allele-specific methylation has been shown to have increased frequency in cancer tissues, given the reads at a CpG-site, it is especially important to consider the possibility of unobserved alleles and their methylation status when determining the CpG-site's possible methylation zygosities. Our method incorporates copy number information when available, and for each CpG-site in a cell, we compute a probability distribution across all possible methylation zygosities. Then, given specific distance values between pairs of distinct zygosities and the likelihood of each possible zygosity for each shared CpG-site in both cells, we compute the expected total distance between any pair of cells as the mean of expected distances across all shared CpG-sites. We leverage such pairwise distances in methylation phylogeny construction. Citation Format: Xuan C. Li, Yuelin Liu, Farid Rashidi, Salem Malikic, Stephen M. Mount, Eytan Ruppin, Kenneth Aldape, Cenk Sahinalp. Epigenomic tumor evolution modeling with single-cell methylation data profiling [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr LB020.
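The pairwise expected distance described in component (iii) of the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three zygosity states, the distance values in `DIST`, the function names, and the assumption that the two cells' zygosities at a site are independent are all hypothetical choices made for illustration.

```python
# Hypothetical zygosity states at a CpG-site (copy number 2 assumed):
# index 0 = homozygous unmethylated, 1 = heterozygous, 2 = homozygous methylated.
# The distance values below are illustrative, not taken from the abstract.
DIST = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.5, 0.0],
]


def expected_site_distance(p_a, p_b, dist=DIST):
    """Expected distance at one CpG-site.

    p_a, p_b: probability distributions over zygosity states for the two
    cells. Independence between the two cells is an assumption of this sketch.
    """
    return sum(
        p_a[i] * p_b[j] * dist[i][j]
        for i in range(len(p_a))
        for j in range(len(p_b))
    )


def expected_cell_distance(cell_a, cell_b, dist=DIST):
    """Mean expected distance over CpG-sites covered in both cells.

    cell_a, cell_b: dicts mapping a site identifier to a zygosity
    distribution. Returns None when the cells share no covered site.
    """
    shared = cell_a.keys() & cell_b.keys()
    if not shared:
        return None
    total = sum(expected_site_distance(cell_a[s], cell_b[s], dist) for s in shared)
    return total / len(shared)


# Usage: only site "s1" is covered in both cells, so it alone contributes.
cell_a = {"s1": [1.0, 0.0, 0.0], "s2": [0.0, 0.0, 1.0]}
cell_b = {"s1": [0.0, 0.0, 1.0], "s3": [1.0, 0.0, 0.0]}
distance = expected_cell_distance(cell_a, cell_b)
```

Averaging over shared sites, rather than summing, keeps the distance comparable between cell pairs that overlap on different numbers of CpG-sites, which matters when coverage is as sparse as the abstract describes.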
Advances in single-cell RNA sequencing (scRNAseq) technologies have uncovered an unexpected complexity in tumors, underlining the relevance of intratumor heterogeneity to cancer progression and therapeutic resistance. Heterogeneity in the mutational composition of cancer cells is a result of distinct (sub)clonal expansions, each with a distinct metastatic potential and resistance to specific treatments. Unfortunately, due to their low read coverage per cell, scRNAseq datasets are too sparse and noisy to be used for detecting expressed mutations in single cells. Additionally, the number of cells and mutations present in typical scRNAseq datasets is too large for available computational tools to, e.g., infer distinct subclones, lineages or trajectories in a tumor. Finally, there are no principled methods to assess distinct subclones inferred through single-cell sequencing data and the genomic alterations that seed and potentially cause them. Here we present Trisicell, a computational toolkit for scalable mutational intratumor heterogeneity inference and assessment from scRNAseq as well as single-cell genome or exome sequencing data. Trisicell allows reliable identification of distinct clonal lineages of a tumor, offering the ability to focus on the most important subclones and the genomic alterations that are associated with tumor proliferation. We comprehensively assessed Trisicell on a melanoma model by comparing the distinct lineages and subclones it identifies on scRNAseq data to those inferred using matching bulk whole exome (bWES) and transcriptome (bWTS) sequencing data from clonal sublines derived from single cells. Our results demonstrate that distinct lineages and subclones of a tumor can be reliably inferred and evaluated based on mutation calls from scRNAseq data through the use of Trisicell. Additionally, they reveal a strong correlation between aggressiveness and mutational composition, both across the inferred subclones and among human melanomas.
We also applied Trisicell to infer and evaluate distinct subclonal expansion patterns of the same mouse melanoma model after treatment with immune checkpoint blockade (ICB). After integratively analyzing our cell-specific mutation calls with their expression profiles, we observed that each subclone with a distinct set of novel somatic mutations is strongly associated with a specific developmental status. Moreover, each subclone had developed a unique ICB-resistance mechanism. These results demonstrate that Trisicell can robustly utilize scRNAseq data to delineate intratumor heterogeneity and help understand biological mechanisms underlying tumor progression and resistance to therapy.
Recent studies on the heritability of methylation patterns in tumor cells suggest that tumor heterogeneity and progression can be studied through methylation changes. To elucidate methylation-based evolution trajectories in tumors, we introduce a novel computational framework for methylation phylogeny reconstruction, leveraging single cell bisulfite treated whole genome sequencing data (scBS-seq), additionally incorporating copy number information inferred independently from matched single cell RNA sequencing (scRNA-seq) data, when available. Our framework consists of three components: (i) noise-minimizing site selection, (ii) likelihood-based sequencing error correction, and (iii) pairwise expected distance calculation for cells, all designed to mitigate the effect of noise and uncertainty due to data sparsity commonly observed in scBS-seq data. We validate our approach with the scBS-seq data of multi-regionally sampled colorectal cancer cells, and demonstrate that the cell lineages constructed by our method strongly correlate with original sampling regions. Additionally, we show that the constructed phylogeny can be used to impute missing entries, which, in turn, may help reduce sparsity issues in scBS-seq data sets. Contact: cenk.sahinalp@nih.gov
Single-cell sequencing data has great potential in reconstructing the evolutionary history of tumors. Rapid advances in single-cell sequencing technology in the past decade were followed by the design of various computational methods for inferring trees of tumor evolution. Some of the earliest of these methods were based on a direct search in the space of trees. However, it can be shown that instead of this tree search strategy we can perform a search in the space of binary matrices and obtain the most likely tree directly from the most likely among the candidate binary matrices. The search in the space of binary matrices can be expressed as an instance of integer linear or constraint satisfaction programming and solved by some of the available solvers, which typically provide a guarantee of optimality of the reported solution. In this review, we first describe one convenient tree representation of tumor evolutionary history and present the tree scoring model that is most commonly used in the available methods. We then provide a proof showing that the most likely tree of tumor evolution can be obtained directly from the most likely matrix from the space of candidate binary matrices. Next, we provide an integer linear programming formulation to search for such a matrix and summarize the existing methods based on this formulation or its extensions. Lastly, we present one use case which illustrates how binary matrices can be used as a basis for developing a fast deep learning method for inferring some topological properties of the most likely tree of tumor evolution.
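To make the matrix view concrete, here is a minimal Python sketch of the standard test for whether a binary cell-by-mutation matrix corresponds to a perfect phylogeny. The abstract does not spell out this condition, so this is an assumption about the setting: it uses the classical three-gametes rule, under which a matrix is conflict-free when no pair of columns contains all three of the row patterns (1, 1), (1, 0) and (0, 1). The function names are hypothetical.

```python
from itertools import combinations


def find_conflicts(matrix):
    """Return the column pairs (p, q) that violate the three-gametes rule.

    matrix: list of rows (cells), each a list of 0/1 entries (mutations).
    A pair of columns conflicts if some row shows (1, 1), another (1, 0),
    and another (0, 1) in those two columns.
    """
    n_cols = len(matrix[0]) if matrix else 0
    forbidden = {(1, 1), (1, 0), (0, 1)}
    conflicts = []
    for p, q in combinations(range(n_cols), 2):
        seen = {(row[p], row[q]) for row in matrix}
        if forbidden <= seen:
            conflicts.append((p, q))
    return conflicts


def is_conflict_free(matrix):
    """True if the matrix admits a perfect phylogeny under the rule above."""
    return not find_conflicts(matrix)


# Usage: the first matrix is conflict-free; in the second, columns 0 and 1
# show all three forbidden patterns.
tree_like = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
noisy = [
    [1, 1],
    [1, 0],
    [0, 1],
]
tree_like_ok = is_conflict_free(tree_like)
noisy_conflicts = find_conflicts(noisy)
```

In an integer linear programming formulation, a condition of this kind is typically imposed as constraints on the output matrix while the objective scores how well that matrix explains the noisy input. The check above only verifies a candidate; it does not search for one.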