First published online August 6, 2004; 10.1104/pp.104.043323
Plant Physiology 136:3023-3033 (2004)
© 2004 American Society of Plant Biologists
Utility of Different Gene Enrichment Approaches Toward Identifying and Sequencing the Maize Gene Space¹,[w]
Center for Plant and Microbial Genomics, Department of Plant Biology, University of Minnesota, St. Paul, Minnesota 55108 (N.M.S.); and Donald Danforth Plant Sciences Center, St. Louis, Missouri 63132 (X.X., W.B.B.)
Maize (Zea mays) possesses a large, highly repetitive genome, and consequently a number of reduced-representation sequencing approaches have been used to try to enrich for gene space while eluding difficulties associated with repetitive DNA. This article documents the ability of publicly available maize expressed sequence tag and Genome Survey Sequences (GSSs; many of which were isolated through the use of reduced representation techniques) to recognize and provide coverage of 78 maize full-length cDNAs (FLCs). All 78 FLCs in the dataset were identified by at least three GSSs, indicating that the majority of maize genes have been identified by at least one currently available GSS. Both methyl-filtration and high-Cot enrichment methods provided a 7- to 8-fold increase in gene discovery rates as compared to random sequencing. The available maize GSSs aligned to 75% of the FLC nucleotides used to perform searches, while the expressed sequence tag sequences aligned to 73% of the nucleotides. Our data suggest that at least approximately 95% of maize genes have been tagged by at least one GSS. While the GSSs are very effective for gene identification, relatively few (18%) of the FLCs are completely represented by GSSs. Analysis of the overlap of coverage and bias due to position within a gene suggests that RescueMu, methyl-filtration, and high-Cot methods are at least partially nonredundant.
Complete genome sequences are a powerful tool being utilized by many biologists. For several model species, such as Saccharomyces cerevisiae, Drosophila melanogaster, Mus musculus, Homo sapiens, Caenorhabditis elegans, and Arabidopsis, a genome sequence of high standards for coverage and accuracy has been elucidated (Goffeau et al., 1996
Maize has an estimated genome size of 2,300 to 2,700 Mb (Arumuganathan and Earle, 1991
Several approaches have been utilized to sequence the maize gene space (Rabinowicz et al., 1999
We collected 64,357 small-insert random maize sequences; 253,138 BAC end sequences; 178,125 RM insertions; 587,371 MF sequences; and 445,286 HC sequences from the National Center for Biotechnology Information (NCBI) sequence repository (Table I) and aligned these to a small set of well-characterized maize full-length cDNAs (FLCs) to evaluate the success of each of these approaches toward tagging and sequencing maize genes. We have used the full-length coding sequences of 78 maize genes to evaluate the frequency of EST and GSS hits as well as the coverage of the sequence. In addition, for a subset of these genes we have used the full-length genomic sequence to verify our results and determine the ability to sequence across introns. We also estimated the frequency of genes for which the GSS sequencing approaches provide upstream untranslated region (UTR) and promoter sequences. Our analyses confirm that MF and HC selected DNA sequences are highly enriched for gene sequences (Palmer et al., 2003
Frequency of EST Sequences per Gene and Coding Sequence Coverage
A subset of 70 FLC sequences from our collection of 78 FLCs was used to perform BLASTN searches of the NCBI EST database. (We excluded the Hon genes from this analysis because they are quite highly expressed.) The distribution of the number of ESTs per gene is shown in Table II. The average number of EST sequences recognizing an FLC was 18. Eighty-four percent of the FLCs were represented by five or more EST sequences, while only two FLCs were unrepresented by ESTs. However, since many of the FLCs used in this study were originally identified via EST sequences, this collection may represent a biased assessment of the frequency of maize FLCs represented by ESTs.
Of the 122,606 bp represented in the 70 FLCs, 89,402 bp were covered by EST sequence, indicating 72.9% coverage. Assembling contigs of the EST sequences that aligned to the FLCs showed that 40% of the 70 FLCs could be represented by a single contiguous sequence while the remaining 41 FLCs contained one or more regions of the coding sequence that were not represented by EST sequence. Assembly of all the ESTs that aligned to the 70 FLCs resulted in 99 nonoverlapping contigs. Therefore, on average, each FLC from our sample dataset of 70 genes is represented by 1.41 EST contigs. This statistic indicates that in many cases multiple EST contigs from assemblies such as the TIGR Gene Index actually represent a single gene. Assuming that our set of 70 FLCs is representative of all maize genes represented by ESTs in terms of size, distribution, and expression suggests that the 56,364 maize EST clusters/singletons in the current TIGR gene index may actually represent approximately 40,000 genes.
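The arithmetic behind these EST estimates can be retraced with a short sketch. It uses only the figures quoted above; the variable names are illustrative, and the final step assumes, as the text does, that the 70 FLCs are representative of all maize genes represented by ESTs.

```python
# Figures quoted in the text for the 70-FLC EST analysis.
flc_total_bp = 122_606      # total length of the 70 FLCs
est_covered_bp = 89_402     # FLC bases aligned to at least one EST
n_flcs = 70
n_est_contigs = 99          # nonoverlapping EST contigs over the 70 FLCs
tigr_clusters = 56_364      # EST clusters/singletons in the TIGR gene index

# Fraction of FLC bases covered by EST sequence.
est_coverage = est_covered_bp / flc_total_bp

# Average number of EST contigs that represent a single gene.
contigs_per_flc = n_est_contigs / n_flcs

# If each gene is split into that many contigs on average, the number of
# distinct genes behind the TIGR clusters is roughly clusters / contigs_per_flc.
estimated_genes = tigr_clusters / contigs_per_flc

print(f"EST coverage: {100 * est_coverage:.1f}%")
print(f"EST contigs per FLC: {contigs_per_flc:.2f}")
print(f"Estimated genes behind TIGR clusters: {estimated_genes:,.0f}")
```

The estimate lands close to 40,000 whether the unrounded ratio 99/70 or the rounded value 1.41 is used as the divisor.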
Results of alignments of the GSSs with the 78 FLCs are presented in Table II. We attempted to identify any systematic bias within the experimental data sets that would tend to over- or underrepresent the true number of GSSs per gene. To examine this, we checked that each GSS was uniquely assigned to a single gene. Five GSSs (out of 1,167) were found to be assigned to two closely related genes: BZ375290 (Nfd103-Nfd107), BZ686390 (Hdt101-Hdt104), BZ753827 (Hdt101-Hdt104), CG288034 (Nfa103-Nfa104), and CG290386 (Hxa102-Hxa103). Our data set has multiple examples of closely related genes (>90% nucleotide identity); however, the finding that 99.6% of GSSs could be unambiguously assigned suggests that incorrect assignment of a GSS to a paralogous gene was an uncommon occurrence within our dataset. Furthermore, comparisons of the rate of gene tagging among members of the subset of FLCs/genes for which we had identified all gene family members, against the rate for gene tagging among all FLCs in our study, revealed no obvious difference (data not shown).
A total of 50,877 and 3,480 GSSs used in our analysis were derived from two randomly sequenced small-insert libraries (ZM_3.0_4.0_KB from TIGR and maize random small-insert library from DuPont, respectively; Meyers et al., 2001
A total of 178,125 RM sequences, each corresponding to the insertion site of a transgenic RM element, have been isolated during the course of the maize gene discovery project led by Virginia Walbot (www.mutransposon.org; Raizada, 2003
A total of 481 HC sequences, representing 353 individual HC clones, align to 73 of the 78 FLCs. On average, each gene was represented by 6.2 HC GSSs, which amounts to an average of 3.81 GSSs per kb of coding sequence. A total of 574 MF GSSs, derived from 376 clones, align to 75 of the 78 FLCs. On average, each gene hit by an MF GSS was represented by 7.4 MF sequences, i.e. there was an average of 5.1 GSSs per kb of coding sequence. Figure 1, C and D, shows the hit distribution of the HC and MF GSSs per FLC and the range of values for the number of GSSs per kb of FLC. The frequency of tagging an FLC by either an HC or MF GSS is 0.108% and 0.098%, respectively. Therefore, assuming our test set of 78 full-length maize genes is a faithful representation of the genes in the maize genome in terms of sequence composition, length, and distribution, the proportion of maize genes that have been tagged by MF with 95% confidence is 0.94 ± 0.05 and the proportion of maize genes tagged by an HC sequence is 0.96 ± 0.04 (see "Materials and Methods"). It should be noted that the combination of MF and HC sequences tags all 78 of the sequence set.
The total collection of 1,518,959 maize GSSs analyzed contains 1,167 sequences corresponding to parts of the 78 FLCs used in this study (0.077% of the available GSSs correspond to one of the 78 FLCs). Every FLC in the dataset was tagged by at least three GSSs, and 97% of FLCs had five or more GSS hits, while 76% of the FLCs had at least 10 GSS hits (Fig. 2E). The average gene within our dataset was 1,743 bp in length and tagged by 15.0 GSSs. Table I shows the normalized frequency for gene discovery for the different types of GSS libraries.
The frequency for gene discovery in randomly sequenced clones was normalized to a value of 1. RM GSSs identified our FLCs at a 2.9-fold higher rate than random sequencing, while HC selection and MF identified the FLCs at a 7- to 8-fold higher rate than random sequencing (Table I). The EST sequencing projects were 24-fold more likely to identify these FLCs than random sequencing, but since most of these genes were initially discovered within EST libraries, we are unable to relate our observed EST hit frequencies to that of an average maize gene.
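The hit frequencies that feed this normalization are simple ratios of aligned reads to library size. The sketch below recomputes them for the libraries whose counts are quoted in the text. The helper names are illustrative, and `fold_over_reference` only shows the form of the normalization: the hit count for the random libraries is not given here, so it is left as a caller-supplied value rather than filled in.

```python
# Counts quoted in the text: (GSSs aligned to an FLC, total GSSs in the set).
LIBRARIES = {
    "HC": (481, 445_286),
    "MF": (574, 587_371),
    "all GSSs": (1_167, 1_518_959),
}

def hit_rate(hits, library_size):
    """Fraction of reads in a library that align to one of the 78 FLCs."""
    return hits / library_size

def fold_over_reference(rate, reference_rate):
    """Gene discovery rate relative to a reference library (reference = 1).

    The reference rate (for example, random small-insert sequencing) must be
    supplied by the caller; it is not derived in this sketch.
    """
    return rate / reference_rate

rates = {name: hit_rate(h, n) for name, (h, n) in LIBRARIES.items()}
for name, rate in rates.items():
    print(f"{name}: {100 * rate:.3f}% of reads hit one of the 78 FLCs")
```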
Comparison of Results from FLC with Genomic Searches
For 12 of the 78 FLCs we had also determined the full-length genomic sequence. Consequently, the ability of different sequencing methods to capture noncoding regions of the complete genes associated with 12 of the FLC sequences could be evaluated. Comparisons of the genes and the GSSs were performed using BLASTN, and the GSSs recognizing these gene sequences were catalogued and compared to the GSSs that were found using the FLCs. All 296 GSSs found by searches using the FLCs were also found by using the full-length genomic sequences as the query. However, the full-length genomic sequences identified another 103 GSSs that were not detected by the FLCs. Inspection of 15 of these 103 GSSs revealed that some were present entirely within introns, while others spanned exon/intron junctions and simply matched too small a region of the cDNA to be considered a valid match by our original criteria.
The BLAST searches using FLCs and genomic sequences revealed that the currently available GSSs do an excellent job of tagging maize genes. While the ability to tag a gene is quite useful to genomic studies, it is critical that the complete sequence of genes be elucidated during sequencing. The coverage of the 78 genes was tested using the alignments of the GSSs to the FLC and genomic sequences. The 12 full-length genomic sequences have a total length of 92,515 bp, of which 60,631 bp (65.5%) is covered by the GSS entries. The total length of the 78 FLC sequences is 135,510 bp, of which 102,028 bp (75.3%) is covered by the available GSSs. The extent of coverage of the FLCs and genomic sequences by each subset of GSSs is illustrated in Figure 2A. Extrapolation from this dataset would suggest that the currently available GSSs are sufficient for providing the sequence of 75% of the coding nucleotides in the maize genome. We sought to determine the extent of coverage of the FLC sequences by GSS contigs. These contigs were determined based on the positions of alignment rather than a computational assembly and, thus, will be more permissive and longer than those assembled by an automated approach. A total of 165 GSS contigs representing the 78 FLCs were assembled. Fifty-seven of the 78 FLCs (73%) were represented by more than one, nonoverlapping GSS contig.
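Position-based contigs of this kind can be derived by merging alignment intervals on the query. The following is a minimal sketch of that idea, not the procedure used in the study: the function names are illustrative, the example alignments are hypothetical, and treating abutting alignments as one contig is an assumption.

```python
def merge_intervals(intervals):
    """Merge overlapping or abutting (start, end) alignments.

    Coordinates are 1-based and inclusive on the query (FLC) sequence.
    Joining abutting alignments into one contig is an assumption.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Extends the current contig.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # A gap precedes this alignment, so a new contig starts.
            merged.append([start, end])
    return [tuple(contig) for contig in merged]

def coverage(intervals, query_length):
    """Return (contigs, covered bases, covered fraction) for one query."""
    contigs = merge_intervals(intervals)
    covered = sum(end - start + 1 for start, end in contigs)
    return contigs, covered, covered / query_length

# Hypothetical GSS alignments on an FLC of 1,743 bp (the average gene
# length quoted in the text); these coordinates are invented for illustration.
example_hits = [(1, 400), (350, 700), (900, 1500), (1450, 1743)]
contigs, covered_bp, fraction = coverage(example_hits, 1743)
print(contigs, covered_bp, f"{100 * fraction:.1f}%")
```

In this invented example the four alignments collapse into two nonoverlapping contigs separated by an uncovered stretch, which is the situation described for 57 of the 78 FLCs.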
While the combination of MF and HC GSS data tags all 78 members of our set of genes, only 70/78 FLCs in our gene set are tagged by both MF and HC GSS reads. If this gene set is representative of the total maize gene set, then 90% of maize genes reside in the sequence space overlapped by MF and HC, while the remaining 10% are represented only by MF or HC sequences. It is unclear whether this is due to the fact that sampling of the maize genome by MF or HC is not complete or whether this reflects that MF and HC sample overlapping but nonidentical regions of the maize genome, as suggested by Whitelaw et al. (2003)
The number of nucleotides represented within the 78 FLC sequences is 135,501. Of this, 94,987 of the nucleotides are covered by MF and HC clones, with the observed proportions illustrated in Figure 2B. The distribution suggested that the portion of the genome sampled by HC and MF may be partially nonoverlapping. One way to examine whether or not MF and HC sample identical sequence space is to estimate what portion of the sequence space is expected to be common to both MF and HC, given a random sampling by both methods, if they sample identical sequence space. If MF and HC sample identical sequence space, then each method should be equivalent in its ability to sample a given nucleotide within that space. Therefore, one could envision that the probability distribution of the number of nucleotides selected by both HC and MF might be approximated by a hypergeometric distribution (Sincich et al., 2002 This model predicts the proportion of nucleotides recovered by random selection by one method (HC) that would be expected to overlap with a similarly random selection by the second method (MF). One possible issue with this modeling is that, experimentally, nucleotides are not strictly independent of one another. Rather, nucleotides that are present within the same sequenced clone are dependent. To approximate the nonrandomness of nucleotides during the cloning procedure, and to simplify the reality that these sequencing reads came from filtered cloning of randomly sheared genomic DNA and thus may overlap at high density, we assumed both a read length of 720 bases and that the reads covered the sampled sequence space in a nonoverlapping fashion. The choice of 720-bp reads is consistent with the average read length of MF and HC sequences generated by the consortium for maize genomics (http://www.tigr.org/tdb/tgi/maize).
Under the aforementioned assumptions, 188 unit reads arranged end-to-end would be required to extend across the lengths of the 78 FLCs tested. The 94,987 nucleotides aligned to MF and HC sequences could likewise be represented within 132 reads, and the 61,805 and 63,842 nucleotides covered individually by MF and HC, respectively, can be represented by 86 and 89 reads, respectively. Furthermore, the number of reads corresponding to the fraction of nucleotides that are sampled by both MF and HC sequences is 43. It is currently unknown if the collection of MF- and HC-derived sequences will cover all of the nucleotides of the 78 test FLCs, or if there is some portion of these that are unavailable to either or both HC and MF cloning methods. In other words, it is not yet clear whether 94,987 nucleotides that MF + HC sample (hypothetically represented by 132 unit reads) of the possible 135,501 (hypothetically represented by 188 unit reads) represent the limits of coverage by both methods or simply reflect the current sequence depth. However, if MF and HC sample identical sequence space, then each method should be equivalent in its ability to sample a given nucleotide within that space, and any of the 132 reads sampled by both HC and MF could therefore be obtained by MF or HC alone. Under the above assumption, the probability distribution for the number of reads selected by both MF and HC is hypergeometric (Sincich et al., 2002
We also addressed the relative distribution of different types of GSSs within the gene length. The length of each FLC was normalized to a value of 100, and the position of the alignment for each EST, GSS, and contig on the normalized scale of 1 to 100 was determined (Fig. 2, C and D). The RM sequences tended to be located near the 5' end of the gene (Fig. 2C), while the randomly sequenced GSSs displayed a more uniform distribution across the length of the gene. This is expected based on the observation that Mutator insertion sites tend to cluster near the 5' UTRs of some genes (R. Meeley, personal communication; Dietrich et al., 2002
Many genome-wide expression studies aim to link regulatory responses in gene-expression levels to cis-acting sequence elements in gene promoters. To date, there is relatively little information about the 5' cis-regulatory sequences of maize genes. While the EST sequences are a rich source of coding sequences, they do not provide information about the 5' cis-regulatory sequences. The ability of the GSSs to provide genomic sequence 5' to the translation start site was tested for a subset of 33 of the genes in our dataset. Iterative BLAST searches were performed to extend the upstream sequence for these 33 genes. For 26 of the 33 genes (78%), at least one GSS that covered the ATG start codon was available (Table II; Fig. 3). The average gene had 891 bp of sequence 5' to the ATG start codon, while the median length of 5' sequence was 586 bp. Seventeen of the 33 genes had at least 500 bp of upstream sequence, 10 of the 33 genes had at least 1 kb of upstream sequence, and the longest extension was 2.2 kb. This sequence is likely to contain 5' UTRs and promoters. We attempted to determine how often putative PolII promoter recognition sequences could be found using the Softberry TSSP package (Mount Kisco, NY; http://www.softberry.com; Shahmuradov et al., 2003
In this study a set of well-characterized maize FLC sequences was used to assess the utility of the maize EST and GSS sequencing projects for tagging and sequencing maize genes. The FLCs used within this dataset display differing patterns and levels of expression, and, other than the fact that the majority of these genes were originally identified through EST sequences, there does not appear to be any evidence to suggest that these genes will not be representative of the maize genome as a whole. To confirm that these results were not over- or underestimating the number of GSSs per gene, we determined the coverage for a subset of these genes with full-length genomic sequences, and we performed BLASTN searches with a subset of genes from families in which all cross-hybridizing sequences had been identified. No evidence was found for a systematic over- or underrepresentation by querying with the full-length coding sequences. All 78 genes used as queries were tagged by multiple GSSs. The rate of gene discovery by the GSSs indicates that the HC and the MF sequences each tag about 95% of maize genes. Therefore, it is probable that the majority of genes within the maize genome have been identified by one of these two approaches. If our FLC set is representative of all maize FLCs, then approximately 50% of coding base pairs have been sequenced by either HC or MF approaches and approximately 75% of the base pairs are sequenced by the combined GSS approaches. While the majority of maize genes have been identified by EST or GSS approaches, relatively few genes have been completely sequenced using these approaches. The average number of contigs/singletons per gene is 1.4 EST and 2.1 GSS contigs/singletons per gene. In our analysis, 41% of the FLCs were represented by multiple EST contigs, and 73% of the FLCs were represented by multiple GSS contigs.
Therefore, further sequencing is necessary to finish the sequencing of the maize gene space and provide a single contiguous sequence for each gene. In addition to identifying coding sequences, the GSSs will also provide introns and promoter sequence information. The majority of genes tested had a GSS that covered the ATG start codon, and promoter sequences could be computationally predicted in half of these genes.
In our study we were also able to address the issue of whether the two major reduced representation GSS approaches used for maize, HC and MF, sequence identical or partially nonoverlapping portions of the maize genome. The finding that many genes were represented by both MF and HC sequences indicates that these two approaches often do overlap in the portion of the genome that they are sampling. Whitelaw et al. (2003)
Our data suggest that the current efforts have been very successful toward sequencing the maize gene space. The majority of maize genes have been identified by EST and/or GSSs. However, most of these genes are only partially represented. Further efforts are necessary to provide more substantial coverage of the gene sequences and to begin to link the assembled GSS contigs to the physical and genetic map of maize.
Sequences Used for BLAST Searches
Our analyses utilized three sets of nucleic acid sequences to perform BLAST searches at NCBI during the week of November 17, 2003. The first set of sequences was 78 full-length B73 cDNA sequences obtained by cDNA clone sequencing, RACE PCR to extend an EST clone, and reverse transcription (RT)-PCR of genomic sequences that cross-hybridize to a gene of interest (further sequence details, map positions, and expression profiles for many of these genes are available at the www.ChromDB.org Web site). Further analyses were performed on a subset of the 78 genes for which we had cloned and sequenced all members of the gene family. This allowed us to unambiguously attribute the GSS or EST sequences to one gene. In addition, the full-length genomic sequence for 13 of the 78 genes was available, and these sequences were used to verify that all sequences matching the cDNA also matched the genomic sequence. Table II lists the sequences, accession numbers, and relevant attributes for the genes used in this study.
All alignments were performed with BLAST version 2.2.6 available from the NCBI (Altschul et al., 1997
If the proportion of maize genes sampled by HC or MF (p) is equivalent to the proportion of those genes within our sample set successfully tagged by MF or HC reads, then the 95% confidence interval for the population proportion of maize genes sampled by HC or MF is:
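A minimal sketch of such an interval follows, assuming the standard normal-approximation form p ± 1.96·sqrt(p(1 − p)/n) with n = 78 FLCs. That form is an assumption here, chosen because it reproduces the ± 0.04 and ± 0.05 half-widths reported in the Results; the function name is illustrative.

```python
import math

def proportion_interval(tagged, n, z=1.96):
    """Normal-approximation 95% interval for a proportion.

    Returns (p, half_width), so the interval is p ± half_width.
    The choice of this interval form is an assumption of the sketch.
    """
    p = tagged / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, half_width

N_FLCS = 78
# Tagged counts quoted in the Results: 73 and 75 of the 78 FLCs.
for tagged in (73, 75):
    p, hw = proportion_interval(tagged, N_FLCS)
    print(f"{tagged}/{N_FLCS}: {p:.2f} ± {hw:.2f}")
```

With only 78 genes and proportions this close to 1, the normal approximation is coarse, so the half-widths are best read as rough guides.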
The expected number of reads common to both MF and HC (see text) was calculated with a PERL script included in the supplemental material (available at www.plantphysiol.org). The expected number is given by the mean of the hypergeometric distribution. If the observed number is greater (or less) than the expected number, the accumulated probability associated with the departure from the expected mean was calculated.
The hypergeometric distribution probability function is $f(x \mid A, B, n) = \binom{A}{x}\binom{B}{n-x} \Big/ \binom{A+B}{n}$, where $x$ represents any possible integer in the interval $\max\{0,\, n-B\} \le x \le \min\{n,\, A\}$, and $\binom{n_1}{n_2}$ represents $n_1!/(n_2!\,(n_1-n_2)!)$ (Sincich et al., 2002
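The calculation can be sketched in Python (the supplemental script itself is in PERL and is not reproduced here). The nucleotide counts and the 720-base unit read come from the text. Mapping A to the MF reads, B to the remaining reads in the combined MF + HC space, and n to the HC reads is one reading of the model and should be treated as an assumption, as should rounding to the nearest whole read.

```python
from math import comb

READ_LEN = 720  # unit read length assumed in the text

def to_unit_reads(nucleotides, read_len=READ_LEN):
    """Convert a nucleotide count into whole unit reads (nearest integer)."""
    return round(nucleotides / read_len)

def hypergeom_pmf(x, A, B, n):
    """f(x | A, B, n) = C(A, x) * C(B, n - x) / C(A + B, n)."""
    return comb(A, x) * comb(B, n - x) / comb(A + B, n)

def hypergeom_mean(A, B, n):
    """Expected number of draws that fall in the A class."""
    return n * A / (A + B)

def lower_tail(x_obs, A, B, n):
    """Accumulated probability of observing x_obs or fewer shared reads."""
    lowest = max(0, n - B)
    return sum(hypergeom_pmf(x, A, B, n) for x in range(lowest, x_obs + 1))

# Nucleotide counts quoted in the text.
union_reads = to_unit_reads(94_987)   # covered by MF and/or HC
mf_reads = to_unit_reads(61_805)      # covered by MF
hc_reads = to_unit_reads(63_842)      # covered by HC
shared_reads = to_unit_reads(61_805 + 63_842 - 94_987)  # covered by both

# Assumed mapping: A = MF reads, B = other reads in the space, n = HC reads.
A, B, n = mf_reads, union_reads - mf_reads, hc_reads
expected_shared = hypergeom_mean(A, B, n)
p_low = lower_tail(shared_reads, A, B, n)

print(union_reads, mf_reads, hc_reads, shared_reads)
print(f"expected shared reads: {expected_shared:.1f}, observed: {shared_reads}")
print(f"P(X <= observed) = {p_low:.3g}")
```

Under this mapping the observed overlap sits at the lower end of the support, because the sampled space is defined as the union of the two read sets; taking the 188 reads that span all 78 FLCs as the population instead changes B and therefore the expected overlap.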
We thank a number of researchers who produced data that enabled this study. Peter Hermanson helped to perform and curate the BLAST searches for this study. Carolyn Napoli has performed the curation and discovery of many of the genes utilized for this study and has assembled this information at www.ChromDB.org. Shawn Kaeppler, Karen Cone, Vicki Chandler, Heidi Kaeppler, and Rich Jorgensen are principal investigators on the National Science Foundation-funded project DBI9975930 who have been involved in the identification and studies of the chromatin genes used for this study. The authors are very appreciative of all of the projects and personnel involved in maize genome sequencing, including the V. Walbot, J. Messing, C. Shubert, Genoplante, P. Schnable, and TIGR projects. Shawn Kaeppler, Peter Tiffin, and Karen Cone read the manuscript and provided valuable suggestions and input. In addition, Robert Meeley provided valuable information about the observed distribution of Mu insertion sites. We are also very thankful for the valuable comments and suggestions provided by several anonymous reviewers.
Received March 22, 2004; returned for revision May 27, 2004; accepted June 1, 2004.
1 This work was supported by the National Science Foundation (grant no. DBI0227310 to N.M.S. and grant no. DBI0221536 to W.B.B.).
[w] The online version of this article contains Web-only data.
www.plantphysiol.org/cgi/doi/10.1104/pp.104.043323.
* Corresponding author; e-mail springer@umn.edu; fax 612-625-1738
* Corresponding author; e-mail bbarbazuk@danforthcenter.org; fax 314-587-1378.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815
Arumuganathan K, Earle ED (1991) Nuclear DNA content of some important plant species. Plant Mol Biol 42: 251-269
Bennetzen JL (1996) The contributions of retroelements to plant genome organization, function and evolution. Trends Microbiol 4: 347-353
Bennetzen JL, SanMiguel P, Chen M, Tikhonov A, Francki M, Avramova Z (1998) Grass genomes. Proc Natl Acad Sci USA 95: 1975-1978
Burr B, Burr FA, Thompson KH, Albertson MC, Stuber CW (1988) Gene mapping with recombinant inbreds in maize. Genetics 118: 519-526
C. elegans Sequencing Consortium (1998) Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282: 2012-2018
Dietrich CR, Cui F, Packila ML, Li J, Ashlock DA, Nikolau BJ, Schnable PS (2002) Maize Mu transposons are targeted to the 5' untranslated region of the gl8 gene and sequences flanking Mu target-site duplications exhibit nonrandom nucleotide composition throughout the genome. Genetics 160: 697-716
Fernandes J, Brendel V, Gai X, Lal S, Chandler VL, Elumalai RP, Galbraith DW, Pierson EA, Walbot V (2002) Comparison of RNA expression profiles based on maize-expressed sequence tag frequency analysis and micro-array hybridization. Plant Physiol 128: 896-910
Goffeau A, Barrell BG, Bussey H, Davis RW, Dujon B, Feldmann H, Galibert F, Hoheisel JD, Jacq C, Johnston M, et al (1996) Life with 6000 genes. Science 274: 546, 563-567
Meyers BC, Tingey SV, Morgante M (2001) Abundance, distribution, and transcriptional activity of repetitive elements in the maize genome. Genome Res 11: 1660-1676
Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, Kravitz SA, Mobarry CM, Reinert KH, Remington KA, et al (2000) A whole-genome assembly of Drosophila. Science 287: 2196-2204
Palmer LE, Rabinowicz PD, O'Shaughnessy AL, Balija VS, Nascimento LU, Dike S, de la Bastide M, Martienssen RA, McCombie WR (2003) Maize genome sequencing by methylation filtration. Science 302: 2115-2117
Peterson DG, Schulze SR, Sciara EB, Lee SA, Bowers JE, Nagel A, Jiang N, Tibbitts DC, Wessler SR, Paterson AH (2002) Integration of Cot analysis, DNA cloning, and high-throughput sequencing facilitates genome characterization and gene discovery. Genome Res 12: 795-807
Rabinowicz PD, Schutz K, Dedhia N, Yordan C, Parnell LD, Stein L, McCombie WR, Martienssen RA (1999) Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome. Nat Genet 23: 305-308
Raizada MN (2003) RescueMu protocols for maize functional genomics. Methods Mol Biol 236: 37-58
Raizada MN, Nan GL, Walbot V (2001) Somatic and germinal mobility of the RescueMu transposon in transgenic maize. Plant Cell 13: 1587-1608
SanMiguel P, Tikhonov A, Jin YK, Motchoulskaia N, Zakharov D, Melake-Berhan A, Springer PS, Edwards KJ, Lee M, Avramova Z, et al (1996) Nested retrotransposons in the intergenic regions of the maize genome. Science 274: 765-768
Shahmuradov IA, Gammerman AJ, Hancock JM, Bramley PM, Solovyev VV (2003) PlantProm: a database of plant promoter sequences. Nucleic Acids Res 31: 114-117
Sincich T, Levine DM, Stephan D (2002) Practical Statistics by Example, Ed 2. Prentice Hall, Upper Saddle River, NJ, pp 1-798
Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al (2001) The sequence of the human genome. Science 291: 1304-1351
Walbot V, Petrov DA (2001) Gene galaxies in the maize genome. Proc Natl Acad Sci USA 98: 8163-8164
Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, et al (2003) Enrichment of gene-encoding sequences in maize by genome filtration. Science 302: 2118-2120
Yuan Y, SanMiguel PJ, Bennetzen JL (2002) Methylation-spanning linker libraries link gene-rich regions and identify epigenetic boundaries in Zea mays. Genome Res 12: 1345-1349
Yuan Y, SanMiguel PJ, Bennetzen JL (2003) High-Cot sequence analysis of the maize genome. Plant J 34: 249-255