Motivation As multi-region, time-series, and single cell sequencing data become more widely available, it is becoming clear that certain tumors share evolutionary characteristics with others. In the last few years, several computational methods have been developed with the goal of inferring the subclonal composition and evolutionary history of tumors from tumor biopsy sequencing data. However, the phylogenetic trees that they report differ significantly between tumors (even those with similar characteristics). Results In this paper, we present a novel combinatorial optimization method, CONETT, for detection of recurrent tumor evolution trajectories. Our method constructs a consensus tree of conserved evolutionary trajectories based on the information about temporal order of alteration events in a set of tumors. We apply our method to previously published datasets of 100 clear-cell renal cell carcinoma and 99 non-small-cell lung cancer patients and identify both conserved trajectories that were reported in the original studies, as well as new trajectories. Availability CONETT is implemented in C++ and available at https://github.com/ehodzic/CONETT.
Motivation Recent advances in single cell sequencing (SCS) offer an unprecedented insight into tumor emergence and evolution. Principled approaches to tumor phylogeny reconstruction via SCS data are typically based on general computational methods for solving an integer linear program (ILP), or a constraint satisfaction program (CSP), which, although guaranteeing convergence to the most likely solution, are very slow. Other approaches, based on Markov chain Monte Carlo (MCMC) or alternative heuristics, not only offer no such guarantee, but also are not faster in practice. As a result, novel methods that can scale up to handle the size and noise characteristics of emerging SCS data are highly desirable to fully utilize this technology. Results We introduce PhISCS-BnB, a Branch and Bound algorithm to compute the most likely perfect phylogeny (PP) on an input genotype matrix extracted from an SCS data set. PhISCS-BnB not only offers an optimality guarantee, but is also 10 to 100 times faster than the best available methods on simulated tumor SCS data. We also applied PhISCS-BnB to a large melanoma data set derived from the sub-lineages of a cell line involving 24 clones with 3574 mutations, for which it returned the optimal tumor phylogeny in less than 2 hours. The resulting phylogeny also agrees with bulk exome sequencing data obtained from in vivo tumors growing out from the same cell line. Availability https://github.com/algo-cancer/PhISCS-BnB
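The abstract above does not spell out what makes a genotype matrix a perfect phylogeny. A minimal sketch of the standard feasibility test follows, assuming a binary matrix with cells as rows, mutations as columns, and a mutation-free root: the matrix admits a perfect phylogeny exactly when no pair of columns contains all three row patterns (1,1), (1,0) and (0,1). The function name `find_conflict` is hypothetical and this is not the PhISCS-BnB code itself; the branch-and-bound search and its bounding function are not shown.

```python
from itertools import combinations


def find_conflict(matrix):
    """Return the first column pair (p, q) violating the three-gametes rule,
    or None if the binary matrix is conflict-free (admits a perfect phylogeny).

    Assumption: rows are cells, columns are mutations, entries are 0 or 1.
    """
    n_cols = len(matrix[0]) if matrix else 0
    forbidden = {(1, 1), (1, 0), (0, 1)}
    for p, q in combinations(range(n_cols), 2):
        # Collect the distinct (p, q) patterns observed across all cells.
        patterns = {(row[p], row[q]) for row in matrix}
        if forbidden <= patterns:
            return (p, q)
    return None


if __name__ == "__main__":
    conflict_free = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
    conflicting = [[1, 1], [1, 0], [0, 1]]
    print(find_conflict(conflict_free))
    print(find_conflict(conflicting))
```

A method that seeks the most likely perfect phylogeny can be viewed as searching for the cheapest set of entry flips after which this check returns `None`.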
Objectives We sought to determine the feasibility and characterize the extinction kinetics of circulating cell-free tumor DNA (cfDNA) testing in endometrial and ovarian carcinomas (ECs, OCs) using a clinically approved, commercially available assay. Methods Women with suspected EC/OC undergoing surgery were consented for tissue and plasma sampling, including pre-operative and serial post-operative draws. DNA was extracted from tumour tissue and patient-matched buffy coat and sequenced for somatic mutations using the FINDIT™ panel assay. cfDNA was extracted from plasma samples, sequenced using FOLLOWIT™ on the Illumina platform, and analyzed using Contextual Genomics’s QUALITY NEXUS analysis pipelines. Low-frequency variants were confirmed by digital droplet PCR. Results 44 individuals had sufficient tissue and follow-up for inclusion: 24 ECs (13 endometrioid, 10 high-grade serous (HGS), 1 clear cell (CC)), 18 OCs (17 HGS, 1 CC), and 2 synchronous endometrial and ovarian carcinomas. Eight EC and 15 OC cases were advanced stage (II-IV), with residual disease in 2 ECs and 5 OCs; 8 recurrence events and 3 deaths were recorded. Compliance with plasma sampling was high (>95%) when requested in hospital or at routine surveillance visits but dropped to 68% for ‘extra’ study-associated visits. Analysis to date reveals that cfDNA was detectable in pre-operative samples of 19 individuals (9 ECs, 10 OCs, including 4 early stage) and in 6 of 10 individuals tested post-operatively. Normalization of conventional tumour markers post-operatively took a median of 3 months, in contrast to the rapid loss of detectable cfDNA. Conclusions cfDNA testing is feasible and may enhance surveillance of endometrial and ovarian carcinomas by reflecting i) volume of disease pre-/post-operatively, ii) response to therapy, and/or iii) recurrence.
Cancer is a genetic disease characterized by the emergence of genetically distinct populations of cells (subclones) through the random acquisition of mutations at the level of single cells and shifting prevalences at the subclone level through selective advantages conferred by driver mutations. This interplay creates complex mixtures of tumor cell populations which exhibit different susceptibility to targeted cancer therapies and are suspected to be the cause of treatment failure. Therefore, it is of great interest to obtain a better understanding of the evolutionary histories of individual tumors and their subclonal composition. In this thesis we present three methods for the inference of tumor subclonal composition and evolution by the use of bulk and/or single-cell DNA sequencing data. First, we present CTPsingle, a method which aims to infer tumor subclonal composition from single-sample bulk sequencing data. CTPsingle consists of two steps: (i) robust clustering of mutations using beta-binomial mixture modelling and (ii) inference of tumor phylogenies by the use of integer linear programming. On simulated data, we show that CTPsingle is able to infer the purity and the clonality of single-sample tumors with high accuracy even when restricted to a coverage depth as low as ∼ 30×. CTPsingle is currently used to infer clonality as a part of the Evolution and Heterogeneity Working Group of the Pan Cancer Analysis of Whole Genomes project, where sequencing data of over 2700 tumors are analyzed. Next, we present B-SCITE, the first available computational approach that infers tumor phylogenies from combined single-cell and bulk sequencing data. B-SCITE is a probabilistic method which searches for the tumor phylogenetic tree maximizing the joint likelihood of the two data types. Tree search in B-SCITE is performed by the use of a customized MCMC search over the space of labeled rooted trees.
Using a comprehensive set of simulated data, we show that B-SCITE systematically outperforms existing methods with respect to tree reconstruction accuracy and subclone identification. On real tumor data, mutation histories generated by B-SCITE show high concordance with expert-generated trees. In the third part, we introduce PhISCS, the first method which integrates single-cell and bulk sequencing data while accounting for the possible existence of mutations affected by undetected copy number aberrations, as well as mutations for which the commonly used and recently debated Infinite Sites Assumption is violated. PhISCS is a combinatorial method and, in contrast to the available alternatives, which are mostly based on probabilistic search schemes, it can provide a guarantee of optimality for the reported solutions. We provide two different implementations of PhISCS: (i) an implementation based on integer linear programming and (ii) an implementation based on constraint satisfaction programming. We show that the latter has lower running time on most of the instances that we used to assess the performance of the two implementations. These results suggest that in some applications constraint satisfaction programming might be a viable alternative to the commonly used integer linear programming. We also demonstrate the utility of PhISCS in analyzing real sequencing data, where it reports more plausible and parsimonious tumor phylogenies than the available alternatives.
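The first step described for CTPsingle, clustering mutations with a beta-binomial mixture, can be illustrated with a small sketch. It assumes each mutation is summarized by a variant read count `k` out of `n` total reads and that each mixture component is a hypothetical `(weight, a, b)` triple; the actual parameterization and fitting procedure used by CTPsingle are not given in the abstract, so this shows only the likelihood and a hard assignment step.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(k, n, a, b):
    """Log-probability of k variant reads out of n under BetaBinomial(n, a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def assign_cluster(k, n, components):
    """Index of the mixture component with the highest weighted likelihood.

    components: list of (weight, a, b) tuples (hypothetical layout).
    """
    scores = [math.log(w) + beta_binomial_logpmf(k, n, a, b)
              for w, a, b in components]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Two illustrative components centred near VAF 0.5 and VAF 0.1.
    components = [(0.5, 50.0, 50.0), (0.5, 10.0, 90.0)]
    print(assign_cluster(15, 30, components))
    print(assign_cluster(3, 30, components))
```

The beta-binomial is commonly preferred over a plain binomial for read counts because its extra parameter absorbs overdispersion, which matters at depths as low as the roughly 30× mentioned above.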
Abstract Motivation Despite the remarkable advances in sequencing and computational techniques, noise in the data and complexity of the underlying biological mechanisms render deconvolution of the phylogenetic relationships between cancer mutations difficult. Moreover, the majority of the existing datasets consist of bulk sequencing data from a single tumor sample per individual. Accurate inference of the phylogenetic order of mutations is particularly challenging in these cases, and the existing methods face several theoretical limitations. To overcome these limitations, new methods are required for integrating and harnessing the full potential of the existing data. Results We introduce a method called Hintra for intra-tumor heterogeneity detection. Hintra integrates sequencing data for a cohort of tumors and infers tumor phylogeny for each individual based on the evolutionary information shared between different tumors. Through an iterative process, Hintra learns the repeating evolutionary patterns and uses this information for resolving the phylogenetic ambiguities of individual tumors. The results of synthetic experiments show an improved performance compared to two state-of-the-art methods. The experimental results with a recent breast cancer dataset are consistent with the existing knowledge and provide potentially interesting findings. Availability and implementation The source code for Hintra is available at https://github.com/sahandk/HINTRA.
Recent technological advances in single cell sequencing (SCS) provide high resolution data for studying intra-tumor heterogeneity and tumor evolution. Available computational methods for tumor phylogeny inference via SCS typically aim to identify the most likely perfect phylogeny tree satisfying the infinite sites assumption (ISA). However, limitations of SCS technologies, such as frequent allele dropout or highly variable sequence coverage, commonly result in mutational call errors and prohibit a perfect phylogeny. In addition, ISA violations are commonly observed in tumor phylogenies due to the loss of heterozygosity, deletions and convergent evolution. In order to address such limitations, we, for the first time, introduce a new combinatorial formulation that integrates single cell sequencing data with matching bulk sequencing data, with the objective of minimizing a linear combination of (i) potential false negatives (due to e.g. allele dropout or variance in sequence coverage) and (ii) potential false positives (due to e.g. read errors) among mutation calls, as well as (iii) the number of mutations that violate ISA - to define the optimal sub-perfect phylogeny. Our formulation ensures that several lineage constraints imposed by the use of variant allele frequencies (VAFs, derived from bulk sequence data) are satisfied. We express our formulation both in the form of an integer linear program (ILP) and - for the first time in the context of tumor phylogeny reconstruction - a boolean constraint satisfaction problem (CSP) and solve them by leveraging state-of-the-art ILP/CSP solvers. The resulting method, which we name PhISCS, is the first to integrate SCS and bulk sequencing data under the finite sites model. Using several simulated and real SCS data sets, we demonstrate that PhISCS is not only more general but also more accurate than the alternative tumor phylogeny inference tools.
PhISCS is very fast, especially when its CSP-based variant is used, and returns the optimal solution, except in rare instances for which it provides an optimality gap. PhISCS is available at https://github.com/haghshenas/PhISCS.
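The objective described in this abstract, a linear combination of false-negative corrections, false-positive corrections, and ISA-violating mutations, can be sketched as a scoring function over a candidate solution. All names and default weights here (`phylogeny_cost`, `w_fn`, `w_fp`, `w_isa`, `removed`) are hypothetical, and the VAF-derived lineage constraints from bulk data are not modelled; in the actual method an ILP or CSP solver minimizes such an objective rather than scoring one fixed candidate.

```python
def phylogeny_cost(observed, corrected, removed=(), w_fn=1.0, w_fp=1.0, w_isa=1.0):
    """Score a candidate corrected genotype matrix against the observed one.

    observed, corrected: binary matrices (rows = cells, columns = mutations).
    removed: column indices treated as ISA-violating and excluded from the tree.
    Weights are illustrative assumptions, not values taken from the paper.
    """
    removed = set(removed)
    false_neg = 0  # observed 0 corrected to 1 (e.g. allele dropout)
    false_pos = 0  # observed 1 corrected to 0 (e.g. read error)
    for obs_row, cor_row in zip(observed, corrected):
        for j, (o, c) in enumerate(zip(obs_row, cor_row)):
            if j in removed:
                continue  # entries of eliminated mutations are not charged
            if o == 0 and c == 1:
                false_neg += 1
            elif o == 1 and c == 0:
                false_pos += 1
    return w_fn * false_neg + w_fp * false_pos + w_isa * len(removed)


if __name__ == "__main__":
    observed = [[1, 1], [1, 0], [0, 1]]
    # Option A: treat the 0 in the last cell as a dropout (one 0 -> 1 flip).
    option_a = [[1, 1], [1, 0], [1, 1]]
    # Option B: keep all calls but eliminate mutation 1 as an ISA violation.
    print(phylogeny_cost(observed, option_a))
    print(phylogeny_cost(observed, observed, removed=(1,), w_isa=2.5))
```

Choosing the weights sets the trade-off: a large `w_isa` makes the solver prefer explaining conflicts as sequencing noise, while a small one makes eliminating mutations cheaper.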
Cancer develops through a continuous process of somatic evolution. Whole genome sequencing provides a snapshot of the tumor genome at the point of sampling; however, the data can contain information that permits the reconstruction of a tumor's evolutionary past. Here, we apply such life history analyses on an unprecedented scale, to a set of 2,658 tumors spanning 39 cancer types. We estimated the timing of large chromosomal gains during tumor evolution, by comparing the rates of doubled to non-doubled point mutations within gained regions. Although we find that such events typically occur in the second half of clonal evolution, we also observe distinctive and early chromosomal gains in some cancer types, such as gains of chromosomes 7, 19 and 20 in glioblastoma, and isochromosome 17q in medulloblastoma. By integrating these results with the qualitative timing of individual driver mutations, we obtained an overall ranking, from early to late, of frequent somatic events per cancer type, which both identified novel patterns of tumor evolution, and incorporated additional detail into known models, such as the progression of APC-KRAS-TP53 in colorectal cancer proposed by Vogelstein and Fearon. To estimate how mutational processes acting on the tumor genome change over time, we classified mutations in each sample according to three broad time periods (early clonal, late clonal, and subclonal), and quantified the activity of mutational signatures in each period. Most mutational processes appear to remain remarkably constant; however, certain signatures show clear and consistent changes during clonal evolution. In particular, mutational signatures associated with exposure to carcinogens, such as smoking and UV light, tend to decrease over time. In contrast, signatures associated with defective endogenous processes, such as APOBEC mutagenesis and defective double strand break repair, show an increase between early and late phases of tumor evolution.
Making use of clock-like mutational signatures, we converted mutational time estimates for large events, such as whole genome duplication (WGD), and the emergence of the most recent common ancestor (MRCA), into real time estimates, which allowed us to combine our analyses into overall timelines of cancer evolution, per tumor type. For example, the typical timeline of ovarian adenocarcinoma development shows that early tumor evolution is characterized by mutations in TP53, and widespread genome instability, with WGD events taking place on average 8 years prior to diagnosis. In later stages of evolution, signatures of defective repair processes increase, and the MRCA emerges on average 1 year before diagnosis. Taken together, these data reveal the common and divergent evolutionary trajectories available to a cancer, which might be crucial in understanding specific tumor biology, and in providing new opportunities for early detection and cancer prevention. Citation Format: Clemency Jolly, Moritz Gerstung, Ignaty Leshchiner, Stefan C. Dentro, Santiago Gonzalez, Thomas J. Mitchell, Yulia Rubanova, Pavana Anur, Daniel Rosebrock, Kaixian Yu, Maxime Tarabichi, Amit Deshwar, Jeff Wintersinger, Kortine Kleinheinz, Ignacio Vasquez-Garcia, Kerstin Haase, Subhajit Sengupta, Geoff Macintyre, Salem Malikic, Nilgun Donmez, Dimitri G. Livitz, Mark Cmero, Jonas Demeulemeester, Steve Schumacher, Yu Fan, Xiaotong Yao, Juhee Lee, Matthias Schlesner, Paul C. Boutros, David D. Bowtell, Hongtu Zhu, Gad Getz, Marcin Imielinski, Rameen Beroukhim, S Cenk Sahinalp, Yuan Ji, Martin Peifer, Florian Markowetz, Ville Mustonen, Ke Juan, Wenyi Wang, Quaid D. Morris, Paul T. Spellman, David C. Wedge, Peter Van Loo, PCAWG Evolution and Heterogeneity Working Group. The evolutionary history of 2,658 cancers [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 218.