Plant Physiology 132:469-484 (2003)
© 2003 American Society of Plant Biologists
RESEARCH PAPERS ON SYSTEMS BIOLOGY/GENOMICS/BIOINFORMATICS
Refined Annotation of the Arabidopsis Genome by Complete Expressed Sequence Tag Mapping¹
Wei Zhu, Shannon D. Schlueter, and Volker Brendel*
Department of Zoology and Genetics (W.Z., S.D.S., V.B.) and Department of Statistics (V.B.), Iowa State University, Ames, Iowa 50011-3260
Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.
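The idea that genome mapping yields EST clusters can be illustrated with a minimal sketch. It assumes each EST has already been aligned to a single genomic span and simply groups ESTs whose spans overlap on the same chromosome. This is not GeneSeqer's algorithm and not the authors' clustering procedure: the function name, the tuple layout, and the sample coordinates are all hypothetical, and strand and exon structure are ignored for brevity.

```python
from collections import defaultdict


def cluster_by_locus(alignments):
    """Group EST ids whose genomic spans overlap on the same chromosome.

    alignments: iterable of (est_id, chrom, start, end) tuples with
    inclusive coordinates (a hypothetical layout, not a GeneSeqer format).
    Returns a list of (chrom, start, end, [est_ids]) tuples.
    """
    by_chrom = defaultdict(list)
    for est_id, chrom, start, end in alignments:
        by_chrom[chrom].append((start, end, est_id))

    clusters = []
    for chrom in sorted(by_chrom):
        current = None  # [chrom, start, end, est_ids] of the open cluster
        for start, end, est_id in sorted(by_chrom[chrom]):
            if current is not None and start <= current[2]:
                # Span overlaps the open cluster: extend it.
                current[2] = max(current[2], end)
                current[3].append(est_id)
            else:
                # No overlap: close the open cluster and start a new one.
                if current is not None:
                    clusters.append(tuple(current))
                current = [chrom, start, end, [est_id]]
        if current is not None:
            clusters.append(tuple(current))
    return clusters


# Made-up example spans, for illustration only.
example = [
    ("est1", "chr1", 100, 500),
    ("est2", "chr1", 450, 900),
    ("est3", "chr1", 2000, 2300),
    ("est4", "chr2", 120, 400),
]
clusters = cluster_by_locus(example)
```

Clusters built this way depend only on genomic position, which is why a genome-anchored mapping could serve as a reference set when evaluating programs that cluster ESTs by sequence similarity alone.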
Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.102.018101.
¹ This work was supported in part by the National Science Foundation (grant no. DBI-0110254 to V.B.).
* Corresponding author; e-mail vbrendel@iastate.edu; fax 515-294-6755.
Received November 21, 2002; returned for revision January 6, 2003; accepted February 20, 2003.