Plant Physiology Preview. Published on March 9, 2007; DOI: 10.1104/pp.107.096677
OPEN ACCESS ARTICLE
Received January 26, 2007
Accepted February 27, 2007
Sampling the Arabidopsis Transcriptome with Massively-Parallel Pyrosequencing
Andreas P.M. Weber, Katrin L. Weber, Kevin Carr, Curtis Wilkerson, and John B. Ohlrogge*
Department of Plant Biology, Michigan State University, East Lansing, MI 48824-1312, USA; Bioinformatic Support Core, Research Technologies Support Facility, Michigan State University, East Lansing, MI 48824, USA
* Corresponding author; email: ohlrogge{at}msu.edu.
Massively-parallel sequencing of DNA by pyrosequencing technology offers much higher throughput and lower cost than conventional Sanger sequencing. Although the technology is already used extensively for genome sequencing, relatively few applications of massively-parallel pyrosequencing to transcriptome analysis have been reported. To test the ability of this technology to provide unbiased representation of transcripts, we analyzed mRNA from Arabidopsis seedlings. Two sequencing runs yielded 541,852 ESTs after quality control. Mapping of the ESTs to the Arabidopsis genome and to TAIR7 cDNA models indicated the following. 1) Massively-parallel pyrosequencing detected transcription of 17,449 gene loci, providing very deep coverage of the transcriptome. A second sequencing run increased the number of genes identified by only 10% but increased the overall sequence coverage by 50%. 2) Mapping of the ESTs to their predicted full-length transcripts indicated that all regions of the transcript were well represented regardless of transcript length or expression level. Furthermore, short, medium, and long transcripts were equally represented. 3) 16,698 of the ESTs that mapped to the genome are not represented in the existing dbEST database. In some cases the ESTs provide the first experimental evidence for transcripts derived from predicted genes, and for at least 60 locations in the genome, pyrosequencing identified likely protein-coding sequences that are not currently annotated as genes. Together, the results indicate that massively-parallel pyrosequencing provides novel information helpful for improving the annotation of the Arabidopsis genome. Furthermore, the unbiased representation of transcripts will be particularly useful for gene discovery and gene expression analysis of non-model plants with less complete genomic information.
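The check described in point 2, whether reads fall evenly along transcripts regardless of their length, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input layout (0-based, end-exclusive alignment coordinates), and the toy data are all assumptions made here for illustration. The idea is to scale each read's midpoint by the length of the transcript it maps to, then count reads per relative-position bin; a roughly flat histogram would suggest no strong 5' or 3' bias.

```python
from collections import Counter


def relative_position_histogram(alignments, transcript_lengths, n_bins=10):
    """Count read midpoints per relative-position bin along transcripts.

    alignments: iterable of (transcript_id, start, end), 0-based, end-exclusive
                (a hypothetical layout, not a format from the article).
    transcript_lengths: dict mapping transcript_id -> length in nucleotides.
    """
    counts = Counter()
    for transcript_id, start, end in alignments:
        length = transcript_lengths[transcript_id]
        midpoint = (start + end) / 2
        # Scale the midpoint to a bin index; clamp so a midpoint at the
        # very 3' end still lands in the last bin.
        bin_index = min(int(midpoint / length * n_bins), n_bins - 1)
        counts[bin_index] += 1
    return [counts.get(i, 0) for i in range(n_bins)]


# Toy data invented for illustration only; not taken from the study.
lengths = {"tx_short": 500, "tx_long": 4000}
reads = [
    ("tx_short", 0, 100),
    ("tx_short", 400, 500),
    ("tx_long", 1900, 2000),
    ("tx_long", 3900, 4000),
]
hist = relative_position_histogram(reads, lengths, n_bins=4)
print(hist)
```

Normalizing by transcript length lets short and long transcripts contribute to the same bins, which is what makes a comparison "regardless of transcript length" possible in a sketch like this.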