First published online November 12, 2008; 10.1104/pp.108.128926

Plant Physiology 149:111-116 (2009)
© 2009 American Society of Plant Biologists

Update on Poaceae Genomes

Poaceae Genomes: Going from Unattainable to Becoming a Model Clade for Comparative Plant Genomics

C. Robin Buell*

Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824

Genomics has immense potential for improving our understanding of critical issues in plant growth and development, some of which can be applied to the improvement of crop production. Midway into the second decade of genomics, genome and transcriptome sequencing efforts within the Poaceae are impressive given the technical and fiscal challenges presented by the typically large, repetitive genomes found within the family (Smith and Flavell, 1975; Arumuganathan and Earle, 1991; SanMiguel et al., 1996). Indeed, as of October 30, 2008, there are 10,847,522 Poaceae sequences representing 11,142 Mb (11.1 Gb) in GenBank, confirming the fast pace of sequence generation for the Poaceae. With respect to representation of genera and species, 2,740 of the approximately 10,000 species reported within the Poaceae (http://www.kew.org/scihort/poaceae.html) have at least one sequence in GenBank. Genome-scale datasets represent a narrower phylogenetic base: of the 47 Poaceae species with genome or transcriptome sequences (Table I; discussed below), 32 are derived from the BEP clade (Bambusoideae, Ehrhartoideae, Pooideae) and 15 are derived from the PACCMAD clade (Panicoideae, Arundinoideae, Chloridoideae, Centothecoideae, Micrairoideae, Aristidoideae, and Danthonioideae). Together they represent five subfamilies, nine tribes, and 24 genera within the Poaceae (Fig. 1). This article is intended to provide a short introduction to genome sequencing efforts in the Poaceae and an abbreviated report of completed and ongoing genome and transcriptome sequencing efforts for Poaceae species, both to demonstrate the potential of species within the Poaceae for understanding plant biological processes and to highlight the Poaceae as a model family for comparative genomics.


Table I. List of Poaceae species with transcriptome and genome sequence data and/or initiatives

 

Figure 1. Phylogeny of grasses for which genome sequence data has been or will be generated in the near future. Asterisk indicates one poorly supported node. Polyploid species for which genome relationships are known are shown to the right of the diploids, with lines indicating ancestry. Most are tetraploid but one (wheat [T. aestivum]) is hexaploid.

 

    EXPRESSED SEQUENCE TAGS: THE START OF THE GENOMICS ERA IN THE POACEAE
The first genome-scale sequences generated for the Poaceae were ESTs, which represent the transcribed portion of a genome and provide a rapid, economical approach to sampling the gene space of an organism. As early-access sequence datasets, ESTs can be used for (1) development of genetic markers (for example, see Harushima et al., 1998), (2) electronic gene expression analyses (for example, see Ewing et al., 1999), (3) improvement of structural annotation (Haas et al., 2003), and (4) functional genomic resources for use in overexpression, in vitro expression, and gene-silencing studies. In 1997, the first set of ESTs for a Poaceae species was reported for rice (Oryza sativa; Yamamoto and Sasaki, 1997). Now, 11 years later, there are 36 Poaceae species with EST collections greater than 1,000 sequences (Table I); of these, three species (maize [Zea mays], rice, and wheat [Triticum aestivum]) each have >1 million ESTs available. Collectively, as of October 30, 2008, within the dbEST division of GenBank, there were 5,491,939 Poaceae EST sequences totaling 2,764 Mb (2.76 Gb). Given the high degree of repetitive sequence within the majority of Poaceae species, coupled with the availability of high-throughput next generation sequencing platforms for transcriptome sequencing (Cloonan et al., 2008; Rosenkranz et al., 2008), ESTs will continue to provide a rich source of genic sequences for grass researchers, and it can be envisioned that, within the coming years, EST collections will be available for thousands of Poaceae species.


    GETTING AT THE GENOME: GENOME SEQUENCES
With continued advancements in technology and concomitant reductions in costs over the last decade, whole-genome sequences have been generated for multiple species within the Poaceae (Table I). Rice was not only the first crop species but also the second plant species with a genome sequence (Barry, 2001; Goff et al., 2002; Yu et al., 2002; The International Rice Genome Sequencing Project, 2005; Yu et al., 2005). Currently, genome sequence is available for two subspecies of rice, indica and japonica. Perhaps most importantly for all current and future Poaceae genome sequences, a high-quality, near-complete genome sequence is available for the Nipponbare cultivar of japonica rice (The International Rice Genome Sequencing Project, 2005) that will most likely provide the only gold standard reference genome for the Poaceae for the near future. Indeed, the Nipponbare sequence served as the reference for resequencing a set of 20 rice lines by hybridization-based sequencing to identify single nucleotide polymorphisms (McNally et al., 2006; http://irfgc.irri.org/index.php?option=com_content&task=view&id=14&Itemid=106; http://oryzasnp.plantbiology.msu.edu/). Draft genome sequences are also available for sorghum (Sorghum bicolor), Brachypodium distachyon, and maize, with analyses and full public release anticipated in 2009 (Table I). A genome project is planned for foxtail millet (Setaria italica) by the U.S. Department of Energy Joint Genome Institute (Table I). Physical resources in the form of bacterial artificial chromosome (BAC) map clones have been developed for a number of Poaceae species in advance of genome sequencing (Table I). Most notably, the OMAP project (Kim et al., 2008) has generated physical maps and BAC end sequences for 12 Oryza species in support of comparative genomics within this important genus.

In addition to transcriptome and whole-genome sequences, large sets of genomic sequences are available within the GSS, HTG, WGS, and PLN divisions of GenBank. Within the GSS division, which includes gene enrichment as well as BAC end sequences, 5,072,454 Poaceae sequences (3,337 Mb) are available. Although the maize and sorghum gene enrichment sequences within the GSS division have now been superseded by draft genome sequences, the gene enrichment approaches of methylation filtration and high Cot were highly successful in generating genic sequences for maize and sorghum, thereby providing early access to the gene space (Palmer et al., 2003; Whitelaw et al., 2003; Bedell et al., 2005). Other Poaceae sequences in GenBank include 179,196 sequences (1,169 Mb) within the PLN division, 18,655 sequences (3,086 Mb) within the HTG division, and 85,280 sequences (785 Mb) within the WGS division.
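As a rough cross-check of the per-division figures quoted above, the counts and sizes can be tallied with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the text (GenBank snapshot of October 30, 2008), while the dictionary layout and the function names `tally` and `share_by_division` are hypothetical choices made for this example, not part of any GenBank tool.

```python
# Poaceae sequence counts and sizes (Mb) per GenBank division, as quoted in
# the text (October 30, 2008). The data layout is an assumption made for
# illustration; it does not mirror any GenBank file format.
DIVISIONS = {
    "EST": (5_491_939, 2_764),
    "GSS": (5_072_454, 3_337),
    "PLN": (179_196, 1_169),
    "HTG": (18_655, 3_086),
    "WGS": (85_280, 785),
}


def tally(divisions):
    """Return (total sequence count, total megabases) across all divisions."""
    total_sequences = sum(count for count, _ in divisions.values())
    total_mb = sum(mb for _, mb in divisions.values())
    return total_sequences, total_mb


def share_by_division(divisions):
    """Return each division's percentage share of the total megabases."""
    _, total_mb = tally(divisions)
    return {name: 100.0 * mb / total_mb for name, (_, mb) in divisions.items()}


if __name__ == "__main__":
    sequences, megabases = tally(DIVISIONS)
    print(f"{sequences:,} sequences, {megabases:,} Mb")
    # List divisions from largest to smallest share of bases.
    shares = share_by_division(DIVISIONS)
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {pct:.1f}% of bases")
```

Summed this way, the five divisions give 10,847,524 sequences and 11,141 Mb, within two sequences and 1 Mb of the overall GenBank totals quoted in the introduction (10,847,522 sequences; 11,142 Mb). The small difference may reflect rounding of the per-division sizes or slight differences in how the totals were compiled; the text does not say which.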

It should be noted that the majority of the sequence currently available for the Poaceae is derived from a few species of high agricultural importance (rice, maize, wheat, and sorghum). As shown in Figure 2, although 13 of the 47 species with genome-scale datasets, resources, or initiatives listed in Table I have >100 Mb of total sequence in GenBank, three-quarters of the sequence is from maize or Oryza species, reflecting the heavy bias in Poaceae genome sequencing projects to date. However, with access to the next generation of genome sequencing technologies (Margulies et al., 2005; Holt et al., 2008; Sarin et al., 2008), it can certainly be envisaged that researchers will have access to dozens of Poaceae genomes in the near future. Furthermore, application of these next generation sequencing technologies along with techniques to enrich for subfractions of the genome (Albert et al., 2007; Hodges et al., 2007; Okou et al., 2007) will greatly enhance resequencing of additional cultivars or accessions, thereby providing a virtually unlimited set of sequence resources to examine genome diversity at the species level.


Figure 2. Genome sequence availability for 47 Poaceae species with genome projects. Sequence was downloaded for all 47 species from GenBank (October 2008) and summed across all divisions. Thirteen species are represented individually in the pie chart; sequence for the 34 species with less than 100 Mb of total sequence in GenBank was grouped into Other.

 
Certainly, this is an exciting time to be engaged in Poaceae research. Even for researchers whose discipline is not genomics, access to multiple Poaceae genome sequences provides not only a robust set of resources for biological inquiry but also a perspective on gene function in a phylogenetic context. With this deluge of genomic sequence data, the storage, handling, analysis, and use of large-scale genomic sequence and annotation data become problematic for most researchers. Consequently, resources, databases, and analysis tools need to be developed to ensure that these genome datasets can be used in a feasible and intelligent manner, thereby maximizing the return on the investment of obtaining the genome sequence. Poaceae researchers are not alone in forging a path through the morass of genome sequence data in the early 21st century, and the tools, resources, software, and knowledge gained from other genomic research endeavors throughout the Tree of Life will be instrumental in obtaining a full understanding of the pan-Poaceae genome.


    ACKNOWLEDGMENTS
 
Efforts in phylogenetic tree construction by E. Kellogg are greatly appreciated. Work on rice genomics was supported by the National Science Foundation (grant nos. DBI–0321538 and DBI–0834043 to C.R.B.).

Received September 1, 2008; accepted November 5, 2008; published November 12, 2008.


    FOOTNOTES
 
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: C. Robin Buell (buell@msu.edu).

www.plantphysiol.org/cgi/doi/10.1104/pp.108.128926

* E-mail buell@msu.edu.


    LITERATURE CITED
Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ, et al (2007) Direct selection of human genomic loci by microarray hybridization. Nat Methods 4: 903–905

Arumuganathan K, Earle E (1991) Nuclear DNA content of some important plant species. Plant Mol Biol Rep 9: 208–218

Barry GF (2001) The use of the Monsanto draft rice genome sequence in research. Plant Physiol 125: 1164–1165

Bedell JA, Budiman MA, Nunberg A, Citek RW, Robbins D, Jones J, Flick E, Rholfing T, Fries J, Bradford K, et al (2005) Sorghum genome sequencing by methylation filtration. PLoS Biol 3: e13

Cloonan N, Forrest AR, Kolle G, Gardiner BB, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G, et al (2008) Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Methods 5: 613–619

Ewing RM, Ben Kahla A, Poirot O, Lopez F, Audic S, Claverie JM (1999) Large-scale statistical analyses of rice ESTs reveal correlated patterns of gene expression. Genome Res 9: 950–959

Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, et al (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296: 92–100

Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, et al (2003) Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res 31: 5654–5666

Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, Yamamoto T, Lin SY, Antonio BA, Parco A, et al (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148: 479–494

Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ, Hannon GJ, et al (2007) Genome-wide in situ exon capture for selective resequencing. Nat Genet 39: 1522–1527

Holt KE, Parkhill J, Mazzoni CJ, Roumagnac P, Weill FX, Goodhead I, Rance R, Baker S, Maskell DJ, Wain J, et al (2008) High-throughput sequencing provides insights into genome variation and evolution in Salmonella typhi. Nat Genet 40: 987–993

Kim H, Hurwitz B, Yu Y, Collura K, Gill N, SanMiguel P, Mullikin JC, Maher C, Nelson W, Wissotski M, et al (2008) Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza. Genome Biol 9: R45

Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–380

McNally KL, Bruskiewich R, Mackill D, Buell CR, Leach JE, Leung H (2006) Sequencing multiple and diverse rice varieties: connecting whole-genome variation with phenotypes. Plant Physiol 141: 26–31

Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME (2007) Microarray-based genomic selection for high-throughput resequencing. Nat Methods 4: 907–909

Palmer LE, Rabinowicz PD, O'Shaughnessy AL, Balija VS, Nascimento LU, Dike S, de la Bastide M, Martienssen RA, McCombie WR (2003) Maize genome sequencing by methylation filtration. Science 302: 2115–2117

Rosenkranz R, Borodina T, Lehrach H, Himmelbauer H (2008) Characterizing the mouse ES cell transcriptome with Illumina sequencing. Genomics 92: 187–194

SanMiguel P, Tikhonov A, Jin YK, Motchoulskaia N, Zakharov D, Melake-Berhan A, Springer PS, Edwards KJ, Lee M, Avramova Z, et al (1996) Nested retrotransposons in the intergenic regions of the maize genome. Science 274: 765–768

Sarin S, Prabhu S, O'Meara MM, Pe'er I, Hobert O (2008) Caenorhabditis elegans mutant allele identification by whole-genome sequencing. Nat Methods 5: 865–867

Smith DB, Flavell RB (1975) Characterisation of the wheat genome by renaturation kinetics. Chromosoma 50: 223–242

The International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436: 793–800

Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, et al (2003) Enrichment of gene-coding sequences in maize by genome filtration. Science 302: 2118–2120

Yamamoto K, Sasaki T (1997) Large-scale EST sequencing in rice. Plant Mol Biol 35: 135–144

Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, et al (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296: 79–92

Yu J, Wang J, Lin W, Li S, Li H, Zhou J, Ni P, Dong W, Hu S, Zeng C, et al (2005) The genomes of Oryza sativa: a history of duplications. PLoS Biol 3: e38



