First published online March 9, 2007; 10.1104/pp.107.096677
Plant Physiology 144:32-42 (2007)
© 2007 American Society of Plant Biologists
OPEN ACCESS ARTICLE
BREAKTHROUGH TECHNOLOGIES
Sampling the Arabidopsis Transcriptome with Massively Parallel Pyrosequencing¹ [W] [OA]
Andreas P.M. Weber,
Katrin L. Weber,
Kevin Carr,
Curtis Wilkerson and
John B. Ohlrogge*
Department of Plant Biology (A.P.M.W., K.L.W., J.B.O.), and Bioinformatic Support Core, Research Technologies Support Facility (K.C., C.W.), Michigan State University, East Lansing, Michigan 48824-1312
Massively parallel sequencing of DNA by pyrosequencing technology offers much higher throughput and lower cost than conventional Sanger sequencing. Although the technology is already used extensively for genome sequencing, relatively few applications of massively parallel pyrosequencing to transcriptome analysis have been reported. To test the ability of this technology to provide unbiased representation of transcripts, we analyzed mRNA from Arabidopsis (Arabidopsis thaliana) seedlings. Two sequencing runs yielded 541,852 expressed sequence tags (ESTs) after quality control. Mapping of the ESTs to the Arabidopsis genome and to The Arabidopsis Information Resource 7.0 cDNA models indicated the following. (1) Massively parallel pyrosequencing detected transcription of 17,449 gene loci, providing very deep coverage of the transcriptome. Performing a second sequencing run increased the number of genes identified by only 10%, but increased the overall sequence coverage by 50%. (2) Mapping of the ESTs to their predicted full-length transcripts indicated that all regions of the transcript were well represented regardless of transcript length or expression level. Furthermore, short, medium, and long transcripts were equally represented. (3) Over 16,000 of the ESTs that mapped to the genome were not represented in the existing dbEST database. In some cases, the ESTs provide the first experimental evidence for transcripts derived from predicted genes, and, for at least 60 locations in the genome, pyrosequencing identified likely protein-coding sequences that are not currently annotated as genes. Together, the results indicate that massively parallel pyrosequencing provides novel information helpful for improving the annotation of the Arabidopsis genome. Furthermore, the unbiased representation of transcripts will be particularly useful for gene discovery and gene expression analysis of nonmodel plants with less complete genomic information.
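The diminishing return from a second sequencing run (few new gene loci, but substantially more sequence coverage) can be illustrated with a minimal sketch. It assumes each run is available as a list of (EST identifier, locus identifier) pairs, one per mapped EST. The function name `cumulative_loci`, the data layout, and the toy values are hypothetical and are not taken from the study or its pipeline.

```python
from typing import Iterable, List, Set, Tuple


def cumulative_loci(runs: Iterable[Iterable[Tuple[str, str]]]) -> List[int]:
    """Return the number of distinct loci detected after each successive run.

    Each run is an iterable of (est_id, locus_id) pairs, one pair per EST
    that mapped to a gene locus. This layout is an assumption made for the
    example, not the format used by the authors.
    """
    seen: Set[str] = set()
    totals: List[int] = []
    for run in runs:
        for _est_id, locus_id in run:
            seen.add(locus_id)
        # Record how many distinct loci have been seen so far.
        totals.append(len(seen))
    return totals


# Toy data, invented purely for illustration.
run1 = [("est1", "AT1G01010"), ("est2", "AT1G01020"), ("est3", "AT1G01010")]
run2 = [("est4", "AT1G01020"), ("est5", "AT1G01030"), ("est6", "AT1G01010")]

loci_after_each_run = cumulative_loci([run1, run2])
ests_after_each_run = [len(run1), len(run1) + len(run2)]
print(loci_after_each_run, ests_after_each_run)
```

In this toy case the second run doubles the number of mapped ESTs while adding a single new locus, which is the qualitative pattern the abstract describes: later runs mostly deepen coverage of genes that were already detected.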
1 This work was supported by a Strategic Partnership Grant (Next Generation Sequencing Center) of the Michigan State University Foundation (to A.P.M.W. and J.B.O.).
The author responsible for distribution of materials integral to the findings presented in this article in accordance with journal policy described in the Instructions for Authors (www.plantphysiol.org) is: John Ohlrogge (ohlrogge{at}msu.edu).
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.107.096677
* Corresponding author; e-mail ohlrogge{at}msu.edu; fax 517-353-1926.
Received January 26, 2007; accepted February 27, 2007; published March 9, 2007.