First published online August 6, 2004; 10.1104/pp.104.041640
Plant Physiology 135:2040-2045 (2004)
© 2004 American Society of Plant Biologists
GENOME ANALYSIS
Types and Frequencies of Sequencing Errors in Methyl-Filtered and High C0t Maize Genome Survey Sequences 1,[w]
Yan Fu,
An-Ping Hsia,
Ling Guo and
Patrick S. Schnable*
Department of Genetics, Development, and Cell Biology (Y.F., L.G., P.S.S.), Department of Agronomy (A.-P.H., P.S.S.), Interdepartmental Graduate Programs in Genetics (Y.F., P.S.S.) and Bioinformatics and Computational Biology (L.G., P.S.S.), and Center for Plant Genomics (P.S.S.), Iowa State University, Ames, Iowa 50011-3650
The Maize Genome Sequencing Consortium has deposited into GenBank more than 850,000 maize (Zea mays) genome survey sequences (GSSs) generated via two gene enrichment strategies, methylation filtration and high-C0t (HC) fractionation. These GSSs are a valuable resource for generating genome assemblies and for discovering single nucleotide polymorphisms and nearly identical paralogs. Based on the rate of mismatches between 183 GSSs (105 methylation filtration + 78 HC) and 10 control genes, the rate of sequencing errors in these GSSs is 2.3 × 10⁻³. As expected, many of these errors were derived from insufficient vector trimming and base-calling errors. Surprisingly, however, some errors were due to cloning artifacts. These G·C to A·T transitions are restricted to HC clones; over 40% of HC clones contain at least one such artifact. Because it is not possible to distinguish the cloning artifacts from biologically relevant polymorphisms, HC sequences should be used with caution for the discovery of single nucleotide polymorphisms or paramorphisms. The average rate of sequencing errors was reduced 6-fold (to 3.6 × 10⁻⁴) by applying more stringent trimming parameters. This trimming resulted in the loss of only 11% of the bases (15,469/144,968). Due to redundancy among GSSs, this more stringent trimming reduced coverage of promoters, exons, and introns by only 0%, 1%, and 4%, respectively. Hence, at the cost of a very modest loss of gene coverage, the quality of these maize GSSs can approach Bermuda standards, even prior to assembly.
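The arithmetic behind the figures quoted above can be checked with a short Python sketch. The numeric inputs are taken from the abstract; the function and constant names are illustrative, and the Bermuda threshold of one error per 10,000 bases is an assumption stated here for comparison, not a value given in the text.

```python
# Figures quoted in the abstract.
BASES_BEFORE_TRIMMING = 144_968   # bases examined before stringent trimming
BASES_REMOVED = 15_469            # bases lost to stringent trimming
ERROR_RATE_DEFAULT = 2.3e-3       # errors per base, original trimming
ERROR_RATE_STRINGENT = 3.6e-4     # errors per base, stringent trimming

# Assumption (not stated in the abstract): the Bermuda standard is taken
# here as at most one error per 10,000 bases.
BERMUDA_MAX_ERROR_RATE = 1e-4


def fraction_lost(removed: int, total: int) -> float:
    """Fraction of bases discarded by trimming."""
    return removed / total


def fold_reduction(rate_before: float, rate_after: float) -> float:
    """How many times lower the error rate is after trimming."""
    return rate_before / rate_after


lost = fraction_lost(BASES_REMOVED, BASES_BEFORE_TRIMMING)
fold = fold_reduction(ERROR_RATE_DEFAULT, ERROR_RATE_STRINGENT)
ratio_to_bermuda = ERROR_RATE_STRINGENT / BERMUDA_MAX_ERROR_RATE

print(f"bases lost to trimming: {lost:.1%}")
print(f"fold reduction in error rate: {fold:.1f}")
print(f"stringent rate relative to assumed Bermuda threshold: {ratio_to_bermuda:.1f}x")
```

Under that assumed threshold, the stringently trimmed reads remain a few-fold above the Bermuda limit, which is consistent with the abstract's wording that quality "can approach" the standard rather than meet it.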
1 This work was supported by competitive grants from the National Science Foundation Plant Genome Program (award nos. DBI-9975868, DBI-0121417, and DBI-0321711). Support was also provided by the Hatch Act and State of Iowa funds.
[w] The online version of this article contains Web-only data.
www.plantphysiol.org/cgi/doi/10.1104/pp.104.041640.
* Corresponding author; e-mail schnable{at}iastate.edu; fax 515-294-5256.
Received February 25, 2004;
returned for revision April 2, 2004;
accepted May 31, 2004.