Mutations that accumulate in the genome of cells or viruses can be used to infer their evolutionary history. In the case of rapidly evolving organisms, genomes can reveal their detailed spatiotemporal spread. Such phylodynamic analyses are particularly useful to understand the epidemiology of rapidly evolving viral pathogens. As the number of genome sequences available for different pathogens has increased dramatically over the last years, phylodynamic analysis with traditional methods becomes challenging as these methods scale poorly with growing datasets. Here, we present TreeTime, a Python-based framework for phylodynamic analysis using an approximate Maximum Likelihood approach. TreeTime can estimate ancestral states, infer evolution models, reroot trees to maximize temporal signals, estimate molecular clock phylogenies and population size histories. The runtime of TreeTime scales linearly with dataset size.
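The temporal signal mentioned above is commonly assessed with a root-to-tip regression: genetic distance from the root is regressed against sampling date, the slope gives a clock rate, and the x-intercept gives a rough date for the root. The following is a minimal sketch of that idea in plain Python. It is not TreeTime's API or implementation; the function name and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, intercept, root_date, r_squared), where rate is the
    slope (substitutions per site per unit time) and root_date is the
    x-intercept, i.e. the date at which the fitted distance is zero.
    """
    if len(dates) != len(distances) or len(dates) < 2:
        raise ValueError("need at least two (date, distance) pairs")
    t_bar, d_bar = mean(dates), mean(distances)
    s_tt = sum((t - t_bar) ** 2 for t in dates)
    s_dd = sum((d - d_bar) ** 2 for d in distances)
    s_td = sum((t - t_bar) * (d - d_bar) for t, d in zip(dates, distances))
    if s_tt == 0:
        raise ValueError("all sampling dates are identical")
    rate = s_td / s_tt
    intercept = d_bar - rate * t_bar
    root_date = -intercept / rate if rate != 0 else float("nan")
    r_squared = (s_td ** 2) / (s_tt * s_dd) if s_dd > 0 else 0.0
    return rate, intercept, root_date, r_squared


# Hypothetical toy data: sampling dates (years) and root-to-tip distances.
dates = [2010.0, 2012.0, 2014.0, 2016.0]
distances = [0.010, 0.014, 0.018, 0.022]
rate, intercept, root_date, r2 = root_to_tip_regression(dates, distances)
print(f"rate={rate:.4f}/year, root date={root_date:.1f}, R^2={r2:.3f}")
```

Rerooting to maximize temporal signal amounts to choosing the root position for which this fit is best; the sketch assumes the root is already fixed.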
The variants of concern (VoCs) of SARS-CoV-2 have highlighted the need for a global molecular surveillance of pathogens via whole genome sequencing. Such sequencing, for SARS-CoV-2 and other pathogens, is performed by an ever increasing number of labs across the globe, resulting in an increased need for an easy, fast, and decentralized analysis of initial data. Nextclade aligns viral genomes to a reference sequence, calculates several quality control (QC) metrics, assigns sequences to a clade or variant, and identifies changes in the viral proteins relative to the reference sequence. Nextclade is available as a command-line tool and as a web application with completely client based processing, meaning that sequence data doesn't leave the user's browser.
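To illustrate what identifying changes relative to a reference means at the nucleotide level, here is a small sketch that lists substitutions between a reference and an already aligned query. It is an assumption-laden toy, not Nextclade's algorithm: it presumes a pairwise alignment is given, ignores insertions and deletions, and the function name and sequences are hypothetical.

```python
def substitutions(reference, query, gap="-", unknown="N"):
    """List substitutions such as 'C4T' using 1-based reference positions.

    Both strings must come from the same pairwise alignment. Gaps in the
    reference (insertions in the query) do not advance the position
    counter; gaps and unknown bases in the query are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    ref_pos = 0
    for ref_base, query_base in zip(reference.upper(), query.upper()):
        if ref_base == gap:
            continue  # insertion relative to the reference, not reported here
        ref_pos += 1
        if query_base in (gap, unknown):
            continue  # deletion or missing data, not a substitution
        if ref_base != query_base:
            changes.append(f"{ref_base}{ref_pos}{query_base}")
    return changes


# Hypothetical aligned toy sequences.
reference = "ATGCCGTA"
query = "ATGTCG-A"
print(substitutions(reference, query))
```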
Many microbial populations rapidly adapt to changing environments with multiple variants competing for survival. To quantify such complex evolutionary dynamics in vivo, time resolved and genome wide data including rare variants are essential. We performed whole-genome deep sequencing of HIV-1 populations in 9 untreated patients, with 6-12 longitudinal samples per patient spanning 5-8 years of infection. The data can be accessed and explored via an interactive web application. We show that patterns of minor diversity are reproducible between patients and mirror global HIV-1 diversity, suggesting a universal landscape of fitness costs that control diversity. Reversions towards the ancestral HIV-1 sequence are observed throughout infection and account for almost one third of all sequence changes. Reversion rates depend strongly on conservation. Frequent recombination limits linkage disequilibrium to about 100 bp in most of the genome, but strong hitch-hiking due to short range linkage limits diversity.
DOI: http://dx.doi.org/10.7554/eLife.11282.001
A variant of SARS-CoV-2 emerged in early summer 2020, presumably in Spain, and has since spread to multiple European countries. The variant was first observed in Spain in June and has been at frequencies above 40% since July. Outside of Spain, the frequency of this variant has increased from very low values prior to 15th July to 40-70% in Switzerland, Ireland, and the United Kingdom in September. It is also prevalent in Norway, Latvia, the Netherlands, and France. Little can be said about other European countries because few recent sequences are available. Sequences in this cluster (20A.EU1) differ from ancestral sequences at 6 or more positions, including the mutation A222V in the spike protein and A220V in the nucleoprotein. We show that this variant was exported from Spain to other European countries multiple times and that much of the diversity of this cluster in Spain is observed across Europe. It is currently unclear whether this variant is spreading because of a transmission advantage of the virus or whether high incidence in Spain followed by dissemination through tourists is sufficient to explain the rapid rise in multiple countries.
The genetic diversity of a species is shaped by its recent evolutionary history and can be used to infer demographic events or selective sweeps. Most inference methods are based on the null hypothesis that natural selection is a weak or infrequent evolutionary force. However, many species, particularly pathogens, are under continuous pressure to adapt in response to changing environments. A statistical framework for inference from diversity data of such populations is currently lacking. Towards this goal, we explore the properties of genealogies in a model of continual adaptation in asexual populations. We show that lineages trace back to a small pool of highly fit ancestors, in which almost simultaneous coalescence of more than two lineages frequently occurs. Whereas such multiple mergers are unlikely under the neutral coalescent, they create a unique genetic footprint in adapting populations. The site frequency spectrum of derived neutral alleles, for example, is nonmonotonic and has a peak at high frequencies, whereas Tajima's D becomes more and more negative with increasing sample size. Because multiple merger coalescents emerge in many models of rapid adaptation, we argue that they should be considered as a null model for adapting populations.
coalescent theory | demographic inference | pathogen evolution | population genetics
Evolutionary change is usually too slow to be observed in real time. A sequence sample represents a static snapshot from which we want to learn about a dynamic evolutionary process. The predominant framework to analyze such population genetic data and infer demographic history is Kingman's neutral coalescent. Within this model, all individuals are equivalent (i.e., there are no fitness differences), and pairs of lineages merge at random. The statistical properties of genealogies in this simple population genetic model can be computed exactly (1, 2), facilitating comparison with data.
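As an illustration of how simple the neutral model is, the sketch below simulates waiting times in Kingman's coalescent and compares the mean time to the most recent common ancestor with the standard exact result 2(1 - 1/n). Time is measured in units where each pair of lineages merges at rate 1. This covers only the neutral, pairwise-merger null model, not the multiple-merger genealogies discussed in the abstract; the function names, seed, and replicate count are arbitrary choices for the example.

```python
import random


def kingman_waiting_times(n, rng):
    """Waiting times T_n, ..., T_2 of Kingman's coalescent.

    While k lineages remain, each of the k*(k-1)/2 pairs merges at
    rate 1, so the time to the next merger is exponential with rate
    k*(k-1)/2. Only pairwise mergers occur.
    """
    times = []
    for k in range(n, 1, -1):
        pair_rate = k * (k - 1) / 2
        times.append(rng.expovariate(pair_rate))
    return times


def expected_tmrca(n):
    """Exact expectation: sum over k of 2/(k*(k-1)) = 2*(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)


rng = random.Random(1)  # fixed seed so the run is reproducible
n = 10
replicates = 20000
mean_tmrca = sum(
    sum(kingman_waiting_times(n, rng)) for _ in range(replicates)
) / replicates
print(f"simulated mean TMRCA: {mean_tmrca:.3f}")
print(f"exact expectation:    {expected_tmrca(n):.3f}")
```

Note that the expected tree height saturates at 2 as the sample size grows, because most of the time is spent waiting for the last two lineages to merge.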
One central prediction of the neutral coalescent is that the genetic diversity of a population is proportional to its size. This prediction, however, is at odds with the observed weak correlation between genetic diversity and population size, a paradox often remedied by the definition of an effective population size proportional to the genetic diversity. The model has been generalized to account for historic changes in population size, mutation rates, geographical structure, and effects of purifying selection (3-7). Positive selection, however, has proved difficult to incorporate, and progress has been limited to rare selective sweeps (8, 9) and weak selection (10).
In many populations, particularly large microbial populations, selection is neither rare nor weak. Instead, these populations are under sustained pressure to adapt to changing environments. Prominent examples include pathogens like influenza that continuously evade human immune responses or HIV, which establishes a chronic infection despite heavy immune predation. The genealogical trees reconstructed from sequence samples often suggest substant...
omas, Fernando González Candelas, SeqCOVID-SPAIN consortium, Tanja Stadler & Richard A. Neher
This is a PDF file of a peer-reviewed paper that has been accepted for publication. Although unedited, the content has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text and figures will undergo copyediting and a proof review before the paper is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.
Given a sample of genome sequences from an asexual population, can one predict its evolutionary future? Here we demonstrate that the branching patterns of reconstructed genealogical trees contain information about the relative fitness of the sampled sequences and that this information can be used to predict successful strains. Our approach is based on the assumption that evolution proceeds by accumulation of small effect mutations, does not require species specific input and can be applied to any asexual population under persistent selection pressure. We demonstrate its performance using historical data on seasonal influenza A/H3N2 virus. We predict the progenitor lineage of the upcoming influenza season with near optimal performance in 30% of cases and make informative predictions in 16 out of 19 years. Beyond providing a tool for prediction, our ability to make informative predictions implies persistent fitness variation among circulating influenza A/H3N2 viruses.
DOI: http://dx.doi.org/10.7554/eLife.03568.001
Investigator group: The members of the WHO European Region sequencing laboratories and GISAID EpiCoV group are listed at the end of the article