Publication details

Genome assembly and annotation for red clover (Trifolium pratense; Fabaceae)

Authors

IŠTVÁNEK Jan, JAROŠ Michal, KŘENEK Aleš, ŘEPKOVÁ Jana

Year of publication 2014
Type Article in Periodical
Magazine / Source American Journal of Botany
MU Faculty or unit Faculty of Science

Doi http://dx.doi.org/10.3732/ajb.1300340
Field Genetics and molecular biology
Keywords assessment of assembly software; de novo assembly; Fabaceae; genome annotation; red clover; Trifolium pratense
Description Red clover (Trifolium pratense) is an important forage plant from the legume family with great importance in agronomy and livestock nourishment. Nevertheless, assembling its medium-sized genome presents a challenge, given current hardware and software possibilities. Next-generation sequencing technologies enable us to generate large amounts of sequence data at low cost. In this study, the genome assembly and red clover genome features are presented. First, assembly software was assessed using data sets from a closely related species to find the best possible combination of assembler plus error correction program to assemble the red clover genome. The newly sequenced genome was characterized by repetitive content, number of protein-coding and nonprotein-coding genes, and gene families and functions. Genome features were also compared with those of other sequenced plant species. Abyss with Echo correction was used for de novo assembly of the red clover genome. The presented assembly comprises ~314.6 Mbp. In contrast to leguminous species with comparable genome sizes, the genome of T. pratense contains a larger repetitive portion and more abundant retrotransposons and DNA transposons. Overall, 47 398 protein-coding genes were annotated from 64 761 predicted genes. Comparative analysis revealed several gene families that are characteristic for T. pratense. Resistance genes, leghemoglobins, and nodule-specific cysteine-rich peptides were identified and compared with other sequenced species. The presented red clover genomic data constitute a resource for improvement through molecular breeding and for comparison to other sequenced plant species.