Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two to three fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.
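The idea of treating sequence tags as dominant markers can be illustrated with a minimal sketch. It assumes per-sample read counts for each tag are already available, scores a tag as present when it has at least `min_reads` reads, and compares samples with a Jaccard index. The Jaccard index, the function names, and the toy counts are illustrative assumptions, not the authors' method or data.

```python
from itertools import combinations


def presence_absence(tag_counts, min_reads=1):
    """Turn per-sample tag read counts into dominant (present/absent) calls.

    min_reads is a hypothetical threshold chosen for illustration.
    """
    return {
        sample: {tag for tag, n in counts.items() if n >= min_reads}
        for sample, counts in tag_counts.items()
    }


def jaccard(a, b):
    """Shared tags divided by all tags observed in either sample."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(calls):
    """Jaccard similarity for every unordered pair of samples."""
    return {
        (s1, s2): jaccard(calls[s1], calls[s2])
        for s1, s2 in combinations(sorted(calls), 2)
    }


# Toy read counts (made up for illustration only).
tag_counts = {
    "sampleA": {"tag1": 5, "tag2": 2, "tag3": 0},
    "sampleB": {"tag1": 3, "tag2": 0, "tag3": 4},
    "sampleC": {"tag1": 1, "tag2": 6, "tag3": 0},
}

calls = presence_absence(tag_counts)
sims = pairwise_similarity(calls)
for pair, value in sims.items():
    print(pair, round(value, 3))
```

Because only presence or absence is scored, a heterozygote cannot be distinguished from a homozygote carrying the tag, which is what makes the marker dominant. With low read depth, a missing tag may also reflect undersampling rather than true absence, so a real analysis would need to account for coverage.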
Flowering time is a complex trait that controls adaptation of plants to their local environment in the outcrossing species Zea mays (maize). We dissected variation for flowering time with a set of 5000 recombinant inbred lines (maize Nested Association Mapping population, NAM). Nearly a million plants were assayed in eight environments but showed no evidence for any single large-effect quantitative trait loci (QTLs). Instead, we identified evidence for numerous small-effect QTLs shared among families; however, allelic effects differ across founder lines. We identified no individual QTLs at which allelic effects are determined by geographic origin or large effects for epistasis or environmental interactions. Thus, a simple additive model accurately predicts flowering time for maize, in contrast to the genetic architecture observed in the selfing plant species rice and Arabidopsis.
Genotyping by sequencing (GBS) is a next generation sequencing based method that takes advantage of reduced representation to enable high throughput genotyping of large numbers of individuals at a large number of SNP markers. The relatively straightforward, robust, and cost-effective GBS protocol is currently being applied in numerous species by a large number of researchers. Herein we describe a bioinformatics pipeline, TASSEL-GBS, designed for the efficient processing of raw GBS sequence data into SNP genotypes. The TASSEL-GBS pipeline successfully fulfills the following key design criteria: (1) Ability to run on the modest computing resources that are typically available to small breeding or ecological research programs, including desktop or laptop machines with only 8–16 GB of RAM, (2) Scalability from small to extremely large studies, where hundreds of thousands or even millions of SNPs can be scored in up to 100,000 individuals (e.g., for large breeding programs or genetic surveys), and (3) Applicability in an accelerated breeding context, requiring rapid turnover from tissue collection to genotypes. Although a reference genome is required, the pipeline can also be run with an unfinished “pseudo-reference” consisting of numerous contigs. We describe the TASSEL-GBS pipeline in detail and benchmark it based upon a large scale, species wide analysis in maize (Zea mays), where the average error rate was reduced to 0.0042 through application of population genetic-based SNP filters. Overall, the GBS assay and the TASSEL-GBS pipeline provide robust tools for studying genomic diversity.
Maize genetic diversity has been used to understand the molecular basis of phenotypic variation and to improve agricultural efficiency and sustainability. We crossed 25 diverse inbred maize lines to the B73 reference line, capturing a total of 136,000 recombination events. Variation for recombination frequencies was observed among families, influenced by local (cis) genetic variation. We identified evidence for numerous minor single-locus effects but little two-locus linkage disequilibrium or segregation distortion, which indicated a limited role for genes with large effects and epistatic interactions on fitness. We observed excess residual heterozygosity in pericentromeric regions, which suggested that selection in inbred lines has been less efficient in these regions because of reduced recombination frequency. This implies that pericentromeric regions may contribute disproportionally to heterosis.
Accelerating crop improvement in sorghum, a staple food for people in semiarid regions across the developing world, is key to ensuring global food security in the context of climate change. To facilitate gene discovery and molecular breeding in sorghum, we have characterized ∼265,000 single nucleotide polymorphisms (SNPs) in 971 worldwide accessions that have adapted to diverse agroclimatic conditions. Using this genome-wide SNP map, we have characterized population structure with respect to geographic origin and morphological type and identified patterns of ancient crop diffusion to diverse agroclimatic regions across Africa and Asia. To better understand the genomic patterns of diversification in sorghum, we quantified variation in nucleotide diversity, linkage disequilibrium, and recombination rates across the genome. Analyzing nucleotide diversity in landraces, we find evidence of selective sweeps around starch metabolism genes, whereas in landrace-derived introgression lines, we find introgressions around known height and maturity loci. To identify additional loci underlying variation in major agroclimatic traits, we performed genome-wide association studies (GWAS) on plant height components and inflorescence architecture. GWAS maps several classical loci for plant height and candidate genes for inflorescence architecture. Finally, we trace the independent spread of multiple haplotypes carrying alleles for short stature or long inflorescence branches. This genome-wide map of SNP variation in sorghum provides a basis for crop improvement through marker-assisted breeding and genomic selection.
Sorghum bicolor | quantitative trait locus | adaptation
Background: Genotyping by sequencing, a new low-cost, high-throughput sequencing technology, was used to genotype 2,815 maize inbred accessions, preserved mostly at the National Plant Germplasm System in the USA. The collection includes inbred lines from breeding programs all over the world. Results: The method produced 681,257 single-nucleotide polymorphism (SNP) markers distributed across the entire genome, with the ability to detect rare alleles at high confidence levels. More than half of the SNPs in the collection are rare. Although most rare alleles have been incorporated into public temperate breeding programs, only a modest amount of the available diversity is present in the commercial germplasm. Analysis of genetic distances shows population stratification, including a small number of large clusters centered on key lines. Nevertheless, an average fixation index of 0.06 indicates moderate differentiation between the three major maize subpopulations. Linkage disequilibrium (LD) decays very rapidly, but the extent of LD is highly dependent on the particular group of germplasm and region of the genome. The utility of these data for performing genome-wide association studies was tested with two simply inherited traits and one complex trait. We identified trait associations at SNPs very close to known candidate genes for kernel color, sweet corn, and flowering time; however, results suggest that more SNPs are needed to better explore the genetic architecture of complex traits. Conclusions: The genotypic information described here allows this publicly available panel to be exploited by researchers facing the challenges of sustainable agriculture through better knowledge of the nature of genetic diversity.
Whereas breeders have exploited diversity in maize for yield improvements, there has been limited progress in using beneficial alleles in undomesticated varieties. Characterizing standing variation in this complex genome has been challenging, with only a small fraction of it described to date. Using a population genetics scoring model, we identified 55 million SNPs in 103 lines across pre-domestication and domesticated Zea mays varieties, including a representative from the sister genus Tripsacum. We find that structural variations are pervasive in the Z. mays genome and are enriched at loci associated with important traits. By investigating the drivers of genome size variation, we find that the larger Tripsacum genome can be explained by transposable element abundance rather than an allopolyploid origin. In contrast, intraspecies genome size variation seems to be controlled by chromosomal knob content. There is tremendous overlap in key gene content in maize and Tripsacum, suggesting that adaptations from Tripsacum (for example, perennialism and frost and drought tolerance) can likely be integrated into maize.