Wheat

From Applied Bioinformatics Group
Revision as of 05:32, 14 November 2011 by Appbio (talk | contribs)

Wheat is a major human staple used for the production of bread, pasta, noodles, beer and even ethanol biofuel. The hexaploid genome of bread wheat is extremely large, at around 16,000 million base pairs (Mbp), which makes it difficult to characterise and confounds many traditional bioinformatics analysis tools. We are developing bioinformatics tools for the analysis of this complex crop genome with the aim of supporting applied crop research and improvement.

Wheat is probably the most important crop in the world, yet it has one of the most challenging genomes. Bread wheat is a hexaploid, with three complete genomes termed A, B and D in the nucleus of each cell. Each of these genomes is almost twice the size of the human genome and consists of around 5,500 million letters (base pairs). Several groups around the world are working towards sequencing wheat. Details of individual efforts can be found on the wiki below.
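The size figures above can be checked with a few lines of arithmetic. The sketch below uses the 5,500 Mbp per-genome figure from the text; the human genome size of roughly 3,100 Mbp is an assumption added for comparison and does not come from this page. The function names are purely illustrative.

```python
# Figure taken from the text above, in million base pairs (Mbp).
SUBGENOME_MBP = 5_500            # approximate size of each of the A, B and D genomes
SUBGENOMES = ("A", "B", "D")

# Assumption (not from this page): a human genome of roughly 3,100 Mbp.
HUMAN_GENOME_MBP = 3_100


def total_genome_mbp(subgenome_mbp=SUBGENOME_MBP, n_subgenomes=len(SUBGENOMES)):
    """Total size, assuming all subgenomes are the same approximate size."""
    return subgenome_mbp * n_subgenomes


def ratio_to_human(subgenome_mbp=SUBGENOME_MBP, human_mbp=HUMAN_GENOME_MBP):
    """How many times larger one wheat subgenome is than the assumed human genome."""
    return subgenome_mbp / human_mbp


if __name__ == "__main__":
    print(f"Total hexaploid genome: {total_genome_mbp():,} Mbp")
    print(f"One subgenome vs human: {ratio_to_human():.2f}x")
```

Three genomes of about 5,500 Mbp give a total of 16,500 Mbp, in line with the rounded figure of 16,000 Mbp quoted for bread wheat. Under the assumed human genome size, each wheat genome is a little under twice as large.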

Genome sequencing projects can generally be divided into whole genome shotgun (WGS) methods and BAC by BAC methods.

WGS attempts to sequence the genome in one go, by generating a large amount of sequence data and then assembling this to produce a representation of the string of letters which make up the genome. WGS has the benefit of being quick and relatively inexpensive, but it is often confounded by the inability to stitch the individual sequence reads together, resulting in a poor quality assembly. This is particularly problematic for polyploids, where more than one genome is present in each cell, or where there is a substantial quantity of repetitive sequence. Wheat is a polyploid with three genomes, each of which is 80% repetitive, making WGS unattractive.
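The problem that repeats cause for read placement can be illustrated with a toy example. This is only a sketch: the sequences are invented, the helper `find_all` is hypothetical, and real assemblers work with overlaps between millions of error-containing reads rather than exact string matches.

```python
def find_all(genome, read):
    """Return every start position at which `read` occurs exactly in `genome`."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Continue searching one base further on so overlapping hits are found too.
        start = genome.find(read, start + 1)
    return positions


# Invented toy sequence: the same repeat element appears twice.
REPEAT = "ACGTACGT"
toy_genome = "TTGCA" + REPEAT + "GGATC" + REPEAT + "CCTAG"

unique_read = "GGATC"   # drawn from the unique region between the repeats
repeat_read = REPEAT    # drawn entirely from within a repeat

if __name__ == "__main__":
    print("unique read positions:", find_all(toy_genome, unique_read))
    print("repeat read positions:", find_all(toy_genome, repeat_read))
```

The read from the unique region has a single possible origin, whereas the read from the repeat matches two locations equally well, so it cannot be placed unambiguously. In a genome that is 80% repetitive and present in three similar copies, this ambiguity affects most reads.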


The alternative BAC by BAC approach requires breaking the genome down into relatively small pieces (c. 120 kbp) cloned as bacterial artificial chromosomes (BACs), ordering these as a minimal tiling path, then sequencing each of the BACs in the tiling path. While sequence assembly of repetitive regions remains problematic, this approach offers the potential to produce the best quality finished genome. However, BAC by BAC sequencing of wheat is hugely expensive and time-consuming, and is still not guaranteed to produce a complete genome because some regions are underrepresented in BAC libraries.
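A rough sense of the scale involved comes from dividing the genome size by the BAC insert size. The sketch below uses the 16,000 Mbp and 120 kbp figures quoted on this page; the `overlap_fraction` parameter and the 20% value used in the example are hypothetical, included only to show that any overlap between neighbouring BACs raises the count further.

```python
import math

GENOME_MBP = 16_000      # bread wheat genome size quoted on this page
BAC_INSERT_KBP = 120     # approximate BAC insert size quoted on this page


def min_bacs_in_tiling_path(genome_mbp, insert_kbp, overlap_fraction=0.0):
    """Lower bound on the number of BACs needed to cover the genome once.

    overlap_fraction is a hypothetical parameter: the fraction of each BAC
    that is shared with its neighbour in the tiling path.
    """
    if not 0.0 <= overlap_fraction < 1.0:
        raise ValueError("overlap_fraction must be in [0, 1)")
    genome_kbp = genome_mbp * 1000
    # Each BAC contributes only its non-overlapping part as new sequence.
    new_sequence_per_bac_kbp = insert_kbp * (1.0 - overlap_fraction)
    return math.ceil(genome_kbp / new_sequence_per_bac_kbp)


if __name__ == "__main__":
    print(min_bacs_in_tiling_path(GENOME_MBP, BAC_INSERT_KBP))
    print(min_bacs_in_tiling_path(GENOME_MBP, BAC_INSERT_KBP, overlap_fraction=0.2))
```

Even with no overlap at all, more than 130,000 BACs would have to be ordered and sequenced individually, which gives an idea of why the approach is so costly for wheat.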


We have taken an alternative approach, combining our experience of second-generation sequencing technology with the ability of the Dolezel group in the Czech Republic to isolate individual chromosome arms.


Starting with as little as 200 ng of amplified chromosome arm DNA, we have demonstrated that we can produce sequence data and assemble all genes for a specific chromosome arm. By including comparative genomic and molecular genetic marker data, we can produce an assembled sequence representing all known genes, including gene promoters and surrounding genome sequence. This syntenic build is the basis for studies of genome evolution and function with the aim of improving this important crop plant. As we assemble and annotate wheat chromosome arms, they are made publicly available using the genome viewer GBrowse2.


Further information on our wheat research is available at wheatgenome.info

