First published online November 16, 2007; 10.1104/pp.107.108597 Plant Physiology 146:32-44 (2008) © 2008 American Society of Plant Biologists OPEN ACCESS ARTICLE
Transcript Profiling by 3'-Untranslated Region Sequencing Resolves Expression of Gene Families1,[W],[OA]
Department of Horticultural Sciences, Plant Molecular and Cellular Biology Program, Genetics Institute, University of Florida, Gainesville, Florida 32611
Differences in gene expression underlie central questions in plant biology extending from gene function to evolutionary mechanisms and quantitative traits. However, resolving expression of closely related genes (e.g. alleles and gene family members) is challenging on a genome-wide scale due to extensive sequence similarity and frequently incomplete genome sequence data. We present a new expression-profiling strategy that utilizes long-read, high-throughput sequencing to capture the information-rich 3'-untranslated region (UTR) of messenger RNAs (mRNAs). Resulting sequences resolve gene-specific transcripts independent of a sequenced genome. Analysis of approximately 229,000 3'-anchored sequences from maize (Zea mays) ovaries identified 14,822 unique transcripts represented by at least two sequence reads. Total RNA from ovaries of drought-stressed wild-type and viviparous-1 mutant plants was used to construct a multiplex cDNA library. Each sample was labeled by incorporating one of 16 unique three-base key codes into the 3'-cDNA fragments, and combined samples were sequenced using a GS 20 454 instrument. Transcript abundance was quantified by frequency of sequences identifying each unique mRNA. At least 202 unique transcripts showed highly significant differences in abundance between wild-type and mutant samples. For a subset of mRNAs, quantitative differences were validated by real-time reverse transcription-polymerase chain reaction. The 3'-UTR profile resolved 12 unique cellulose synthase (CesA) transcripts in maize ovaries and identified previously uncharacterized members of a histone H1 gene family. In addition, this method resolved nearly identical paralogs, as illustrated by two auxin-repressed, dormancy-associated (Arda) transcripts, which showed reciprocal mRNA abundance in wild-type and mutant samples. Our results demonstrate the potential of 3'-UTR profiling for resolving gene- and allele-specific transcripts.
Functional analysis of plant genomes requires methods for resolving differential expression of closely related genes. The ability to distinguish between paralogs (e.g. gene family members) and alleles on a genome-wide scale is key to understanding the genetic basis of quantitative traits in diverse plant populations. Genes with extensive sequence similarity may comprise a significant portion of a given transcriptome. Among maize (Zea mays) inbreds, for example, 90% of the alleles are polymorphic (Wright et al., 2005
Differential expression of related, duplicated genes has been linked to functional diversity within species (Gu et al., 2004
Despite rapid advances in expression profiling techniques, the capacity to distinguish among closely related transcripts on a genome-wide scale remains a challenge. In microarray analyses, cross hybridization of similar transcripts to a given oligonucleotide probe may confound expression of individual genes. With sequencing of several genomes complete (e.g. Arabidopsis [Arabidopsis thaliana], rice [Oryza sativa], and poplar), whole-genome tiling arrays have allowed unbiased interrogation of the transcriptome (Yamada et al., 2003
The emergence of high-volume, short-read sequencing technologies has increased resolution for quantitative transcriptome analysis in organisms for which complete genomic sequence is available. Advances in serial analysis of gene expression (SAGE) have opened transcript profiling to unbiased sampling and quantitative analysis of gene expression (Saha et al., 2002
Alternatively, genome-wide profiling by massively parallel sequencing (Meyers et al., 2004
Longer read lengths are achieved with 454-based pyrosequencing, initially described by Margulies et al. (2005)
Here, we present a strategy that harnesses the specificity and information content of the 3'-UTR in a long-read, 454-based sequencing approach to transcript profiling. A key to this method is the use of 3'-anchored sequence reads long enough for unambiguous identification of closely related transcripts. By targeting the 3'-UTR of mRNAs, an unprecedented resolution is achieved for gene- and allele-specific transcripts, even for genomes that are only partially sequenced or lack extensive EST coverage. In addition, detection of haplotypes containing multiple polymorphisms is facilitated by the longer read length. These components of the transcriptome are thus opened to quantitative analysis beyond that currently accessible with short-tag sequencing technologies. In this work, maize provides an ideal system to assess our 3'-anchored strategy, because the genome is rich in genetic complexity from extensive gene duplication (Messing et al., 2004
In this study, we introduce a 3'-UTR profiling method that allows quantitative analysis of gene-specific expression on a genome-wide level, here using mutant and wild-type maize ovaries. Concurrent sequencing of multiple mRNA samples was enabled by the use of a multiplexing strategy. Results provided quantitative expression profiles with read output evenly distributed between samples. The frequency of 3'-anchored sequence reads aligning to a given cDNA was used to quantify mRNA abundance and to measure differential gene expression. The long read lengths combined with the specificity of the 3'-UTR were sufficient to distinguish individual members of a previously characterized gene family as well as provide quantitative comparisons of closely related transcripts that matched unique maize ESTs or assembled cDNAs. In addition, insertion-deletion (indel) polymorphisms were readily detectable by this method and resolved nearly identical paralogous gene products.
3'-cDNA Library Construction
We synthesized 3'-anchored cDNA template libraries to generate gene-specific sequence reads by 454 using the protocol shown in Figure 2. Concurrent sequencing of up to 16 individual sublibraries is enabled by incorporation of a three-base multiplex key in the A-adaptor (Fig. 2A). By using a subset of 16 three-base keys, we could detect single-base errors in the multiplex key. Addition of a fourth base to the multiplex key would enable up to 64 unique combinations with error detection, thus enhancing the number of concurrently sequenced sublibraries. Each 3'-UTR sublibrary was constructed from total RNA (Fig. 2B) using a modified, biotinylated 454-B adaptor that incorporates an oligo(dT) tail for priming cDNA synthesis from poly(A+) RNA. Following second-strand cDNA synthesis, biotinylated cDNAs were bound to Streptavidin beads, purified by magnetic pull-down, and digested with MspI to generate 3'-cDNA fragments with 2-base (CG) overhangs. Specific multiplex A-adaptors were then ligated to the purified 3' fragments. A detailed description is provided in "Materials and Methods." MspI was selected based on simulated digests of 70,000 3'-orientated ESTs of maize (Fig. 3) from the maize full-length cDNA project (www.maizecdna.org). Predicted tag lengths were used to assess the proportion of 3' enrichment in comparison to rice 3'-UTRs. While the expected size distribution of MspI-digested cDNA fragments is optimal for the Genome Sequencer 20 read length (approximately 100 bases), the longer reads generated by the 454-FLX instrument (approximately 250 base average) would likely extend through the 3'-UTR of many transcripts. If so, the number of FLX reaction cycles could be configured to optimize average read length (Harkins and Jarvie, 2007
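The simulated digest used to choose MspI can be illustrated with a short Python sketch. It is not the authors' code: the function names and the bin size are hypothetical, and it assumes each input sequence is given in sense orientation with the poly(A) tail already removed. MspI recognizes CCGG and cuts after the first C, so the predicted 3' tag runs from the 3'-most site to the end of the sequence.

```python
from typing import Dict, Iterable, Optional

MSPI_SITE = "CCGG"  # MspI recognition site; the enzyme cuts C^CGG


def three_prime_tag_length(cdna: str) -> Optional[int]:
    """Length of the fragment from the 3'-most MspI cut to the 3' end.

    Returns None when the sequence has no MspI site (no tag would be produced).
    """
    seq = cdna.upper()
    pos = seq.rfind(MSPI_SITE)  # start index of the 3'-most site
    if pos == -1:
        return None
    # The cut falls after the first C, so the tag starts at pos + 1.
    return len(seq) - (pos + 1)


def tag_length_histogram(cdnas: Iterable[str], bin_size: int = 50) -> Dict[int, int]:
    """Count predicted tags per length bin (keyed by the lower bin edge)."""
    hist: Dict[int, int] = {}
    for seq in cdnas:
        length = three_prime_tag_length(seq)
        if length is None:
            continue
        lower_edge = (length // bin_size) * bin_size
        hist[lower_edge] = hist.get(lower_edge, 0) + 1
    return dict(sorted(hist.items()))
```

Comparing such histograms for several candidate enzymes against the expected read length (about 100 bases for the Genome Sequencer 20) is one plausible way to rank them.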
To test the 3'-UTR profiling strategy, we sequenced 3'-anchored cDNA sublibraries prepared from immature ovaries of isogenic viviparous-1 (vp1) mutant and wild-type maize plants in a W22 inbred genetic background. Prior to RNA sampling, plants were subjected to a drought stress treatment ("Materials and Methods"). VP1 is a transcription factor that mediates a subset of responses to the plant stress hormone, abscisic acid (ABA), including maturation and onset of seed desiccation tolerance. The classic vp1 phenotype is that of precocious germination due to reduced ABA sensitivity (McCarty et al., 1991
A sequencing reaction on the Genome Sequencer 20 instrument (Margulies et al., 2005
The capacity of these consensus sequences to identify individual genes based on specificity of their 3'-UTRs was tested by aligning these reads to available maize cDNA databases (The Institute for Genomic Research Zea mays Gene Index [ZmGI] and Industry UniGene [IUC]) using BLASTN (cutoff: E < 10^-7). At least 87% of the consensus tags matched cDNAs and 66% aligned with a gene-enriched maize genomic assembly (MAGI; Fu et al., 2005
Based on the set of unique consensus sequences obtained from the two-sample library, we developed a graphic display for the quantitative transcriptome profile (Fig. 4, A and B). We quantified gene expression for each of 11,559 consensus sequences that matched unique cDNAs using read frequencies. The results are plotted on a logarithmic scale to capture the full range of expression. The 11,559 3' sequences profiled were also analyzed based on Gene Ontology functional classifications determined by PFam searches derived from ZmGI and IUC databases. Analysis of respective maize cDNAs revealed 5,202 (45%) that were unclassified and lacked annotation based on sequence similarity. An additional 578 (5%) of consensus sequences matched genes having conserved domains of unknown function.
The relationship between abundance of each mRNA and its rank (ordered from most to least prevalent) in the whole dataset approximated a Zipf power law (slope near –1 on a log-log plot of abundance against rank). This distribution was evident among transcripts overall (Fig. 4A) and within individual functional classes (Figs. 4B and 5A). Zipf's power law relationships are observed in a wide range of natural phenomena, including the distribution of gene expression in a variety of organisms (Kuznetsov et al., 2002
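A rank-abundance slope of this kind can be estimated with an ordinary least-squares fit on log-transformed values. The Python sketch below is illustrative only: the function name is hypothetical, the fitting method is an assumption rather than the procedure used for Figure 4, and rank 1 is assigned to the most abundant transcript, which is the ordering under which a Zipf-like distribution gives a slope near –1.

```python
import math
from typing import Iterable


def zipf_slope(read_counts: Iterable[float]) -> float:
    """Least-squares slope of log10(abundance) against log10(rank).

    Rank 1 is the most abundant transcript. Zero counts are dropped
    because their logarithm is undefined.
    """
    counts = sorted((c for c in read_counts if c > 0), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two non-zero counts to fit a slope")
    xs = [math.log10(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log10(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope well away from –1 within one functional class would flag a skewed class for closer inspection.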
Distinguishing Gene Family Members
To evaluate 3'-UTR profiling for resolution of individual gene family members, we analyzed the cellulose synthase (CesA) gene family (Fig. 5A). The assembled 3'-anchored sequences distinguished 12 unique transcripts representing nine annotated CesA gene family members (Supplemental Table S1) that were previously characterized in maize (Holland et al., 2000
In addition, we analyzed a group of closely related histone H1-like transcripts (Fig. 5B). These transcripts matched a unique, nonredundant set of ESTs from various maize cDNA libraries and were annotated based on sequence similarities in other species. Although these H1 genes have not been individually characterized in maize, BLASTN results provided insight for eventual functional analysis. For example, a very highly expressed H1-like transcript (TC292133a) matched a drought- and ABA-induced gene that had been characterized in tomato (Bray et al., 1999
The use of 3'-UTR profiling as an effective strategy for detecting quantitative differences in transcript abundance between samples was evaluated based on read frequencies generated from individual sublibraries. Read frequencies representing each expressed gene were determined for wild-type and mutant sublibraries by parsing the CAP3 ace file output. We analyzed 4,147 consensus sequences that were represented by a total of 10 or more reads using a
Quantitative differences in levels of specific mRNAs were confirmed for a subset of genes by real-time reverse transcription (RT)-PCR analyses of the wild-type and vp1 mutant samples (Fig. 6 ). Results showed that differences in transcript abundance between wild-type and mutant RNA samples used in 3'-cDNA sublibrary construction paralleled the 454-based expression profiles.
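Per-sublibrary read frequencies of the kind described above can be tallied from the contig (`CO`) and read-assignment (`AF`) lines of a CAP3 ace file. The Python sketch below is an illustration under stated assumptions, not the authors' parser: it assumes each read name carries a sample label as a prefix (for example `WT_` or `VP1_`), which is a hypothetical naming convention, and `sample_of` is a caller-supplied function.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable


def count_reads_per_contig(
    ace_lines: Iterable[str],
    sample_of: Callable[[str], str],
) -> Dict[str, Dict[str, int]]:
    """Count reads per contig and per sample from ACE-format lines.

    'CO <contig> ...' starts a contig; each 'AF <read> ...' line that
    follows assigns one read to the current contig.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    contig = None
    for line in ace_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines, sequence lines, bare tags
        if fields[0] == "CO":
            contig = fields[1]
        elif fields[0] == "AF" and contig is not None:
            counts[contig][sample_of(fields[1])] += 1
    return {name: dict(per_sample) for name, per_sample in counts.items()}


def sample_from_prefix(read_name: str) -> str:
    """Hypothetical convention: sample label precedes the first underscore."""
    return read_name.split("_", 1)[0]
```

The resulting table of counts per contig and sample is the input a frequency-based comparison between wild-type and mutant sublibraries would need.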
Resolution of Near-Identical Transcripts by Polymorphisms
Analyses of the maize genome have revealed a high frequency of nearly identical paralogs with
Earlier work identified ARDA1 as a potentially important contributor to stress tolerance in hybrid maize (Guo et al., 2004
We conducted a detailed analysis of polymorphisms detected by a preliminary dataset comprised of 1,263 W22 consensus sequences using BLASTN alignment to MAGI4 B73 genomic sequences. We expected that some portion of apparent polymorphisms in consensus sequences ranging from two to 75 reads (56.6%; Supplemental Table S3) was due to sequence errors. To estimate the contribution of sequence errors in the 454 data, we evaluated polymorphisms detected by a subset of 107 cDNA consensus sequences (seven to 75 reads) with respect to B73 MAGI assemblies by independent BLASTN searches of IUC cDNA and public EST databases. We confirmed 93.8% of 146 sequence polymorphisms detected within 52 W22 alleles by identical cDNA matches, indicating that most identify independently documented maize alleles.
Because the pyrosequencing method used by 454 is prone to errors in estimating lengths of long homopolymer runs (Margulies et al., 2005
Our results demonstrate that 3'-UTR profiling is an effective strategy for high-resolution global analysis of gene expression that does not require a complete genome sequence. Using this approach, we were able to identify over 14,000 gene-specific mRNAs and quantify expression based on read frequencies occurring in 3'-anchored consensus sequences. Analysis of the quantitative 3'-UTR profile revealed a dynamic range of gene expression spanning greater than 3 orders of magnitude.
Our strategy of using long-read, 454 sequencing to target gene-specific 3'-UTRs offers several advantages over previous tag-based approaches to global expression profiling. First, depth of sequencing is enhanced by anchoring the 454 reads to unique sites proximal to the 3' ends of transcripts. This eliminates redundancy associated with shotgun sequencing of cDNA fragments, thus providing more reads per unique transcript and reducing the potential for highly expressed mRNAs to saturate the library (Weber et al., 2007
Second, the specificity of these long, 3'-UTR-based sequence reads facilitates unambiguous gene assignment. Our analyses indicated that individual gene family members can be resolved by unique, gene-specific 3'-anchored tags, and the corresponding closely related ESTs can be characterized. Finally, enrichment of 3'-UTR sequences provides a useful source of polymorphic information for studies of natural variation. Identification and analysis of nearly identical paralogous genes is improved on a genome-wide scale by enrichment for polymorphisms in the 3' sequences. Even in cases where genomic information is very limited, high-throughput sequencing of 3'-UTRs from species' variants allows direct comparison of polymorphic loci. This approach thus provides a tool for genotyping and assessing genetic diversity contributing to quantitative traits without the need for a sequenced genome or extensive EST collections.
Approximately 22% of the unique mRNAs identified in this study by at least two reads did not match ESTs in either ZmGI or IUC databases. A similar percentage of novel sequences (30%) was also observed for a transcript profile from maize shoot apical meristem using 454-based shotgun sequencing of sheared cDNAs (Emrich et al., 2007a
Furthermore, quantitative analyses of closely related transcripts can extend studies of functional genomics to species without completely sequenced genomes and where gene families are largely uncharacterized. We addressed this possibility with an analysis of H1-like transcripts in maize ovaries. Although the individual genes have not been characterized in maize, identification of the corresponding H1 ESTs indicated that these unique, nonredundant transcripts are indeed expressed. One highly represented H1 mRNA in maize, TC292133a, was annotated as a drought- and ABA-inducible H1 gene based on sequence similarities in tomato (Bray et al., 1999
For organisms that have limiting cDNA resources, 3'-cDNA tags will be less likely to align with upstream coding sequences, thus constraining functional annotation. Nonetheless, 3'-UTR sequences enable resolution of unique mRNAs and distinguish among closely related transcripts. Quantitative data on transcript abundance are also provided, as well as an open, unbiased sampling of the transcriptome. Where additional cDNA information is available, the 3'-cDNA sequences can be extended by BLASTN alignments. Alternatively, the sequence tags can be used to design primers or probes for screening of cDNA libraries. While the divergence of 3'-UTR sequences facilitates resolution of genes within a genome, it may limit the effectiveness of cross-species comparisons for annotation of transcripts. For example, alignment of maize ovary 3'-cDNA consensus tags to the complete set of rice genes (OsGI) using BLASTN produced matches (expectation score <1e–5) for only 20.6% of the transcripts.
Based on our analysis of SNPs identified within consensus sequences and comparisons with B73 MAGI genomic assemblies (Supplemental Table S3), we confirmed at least 93.8% of polymorphisms independently by identical cDNA matches. These data are consistent with a recent study by Barbazuk et al. (2007)
In addition, our preliminary results indicate that homopolymer base-calling errors will have a minor impact on the ability to analyze polymorphisms in maize cDNAs. Importantly, even where errors of this type occur, the consistency of base calling in reads derived from independent 454 libraries suggests that nonidentical alleles may still be distinguished if they give rise to different consensus sequences. This level of specificity in gene expression analysis is invaluable to uncovering novel variation in polyploid or paleopolyploid genomes (Osborn et al., 2003
Our analysis of expressed CesA gene family members demonstrates the capacity of the approach described to provide quantitative resolution of closely related transcripts. This is achieved by specificity of the 3'-UTRs for individual cDNAs. Cross hybridization of near-identical transcripts often complicates identification of individual gene family members in array-based experiments. Consistent with this, the resolution of three CesA4 transcripts, including a putative splice variant, denotes the complexity within the CesA gene family in maize. Even with the most stringent probe designs, cross hybridization with unknown family members remains a challenge in nonsequenced genomes. With unbiased sampling and sequencing, resolution of tissue and/or temporal-specific transcripts and polymorphic variants will provide functional clues in complex genomes such as maize (Ma et al., 2006
Results from the quantitative 3'-UTR expression profile showed that the Zipf power distribution of gene expression observed across the entire dataset was not conserved within the chromatin-related functional class. This group of mRNAs showed a skewed distribution of abundance due mainly to a large number of distinct, highly expressed H3 transcripts. Among these, we identified 67 mRNAs having H3 functional domains, and 39% of the consensus sequences were represented by 100 to 1,000 reads. Results may be due to transcriptional responses to the stress treatment or be specific to the reproductive tissues examined. Validation of differences in transcript abundance for a subset of genes by real-time RT-PCR in RNA samples used for sublibrary construction supports 3'-UTR profiling as a platform for quantitative expression profiling between samples. Furthermore, construction of the 3'-cDNA libraries by this method yielded sequences with very low retrotransposon content and nominal rRNA contamination. In addition, read distribution between multiplexed samples was well balanced.
Thus, a multiplexing strategy can be used to concurrently profile multiple samples for increased cost effectiveness. Incorporation of a 4-base error detecting key enables up to 64 unique combinations for individual sample recognition.
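The key counts quoted here (16 three-base keys, 64 four-base keys) match what a simple checksum construction yields: if the last base is chosen so that the base values sum to zero modulo 4, any single-base substitution breaks the checksum and is detectable. The Python sketch below illustrates that construction only. It is an assumption that this resembles how the keys were chosen, the base-to-number mapping is arbitrary, and the generated set is not claimed to be the authors' key set.

```python
from itertools import product
from typing import List

BASES = "ACGT"


def error_detecting_keys(length: int) -> List[str]:
    """Keys whose base values (A=0, C=1, G=2, T=3) sum to 0 modulo 4.

    The first length-1 bases are free and the last is a check base, so
    there are 4**(length-1) keys, and any two differ in at least two
    positions.
    """
    value = {base: i for i, base in enumerate(BASES)}
    keys = []
    for prefix in product(BASES, repeat=length - 1):
        check = (-sum(value[b] for b in prefix)) % 4
        keys.append("".join(prefix) + BASES[check])
    return keys


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length keys differ."""
    return sum(x != y for x, y in zip(a, b))
```

The two keys reported in "Materials and Methods," CAT and AGT, differ at two positions, so a single miscalled base cannot convert one into the other. Practical key design may impose further constraints that this sketch does not model.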
Preliminary data generated with the recently upgraded FLX 454 technology (Harkins and Jarvie, 2007
Also, the range of gene expression quantified by the 454-based 3'-UTR profile provides higher resolution in global transcript profiling analyses compared to array-based hybridization experiments. Accordingly, our results include detection of many rare mRNAs as well as quantification of highly abundant transcripts. In contrast, this level of resolution was not observed in initial microarray analyses of the same tissues (A.L. Eveland, unpublished data) due to threshold levels of detection and saturation. Likewise, with array-based interpretation of fold-changes, subtle variations in gene expression are often not detected but can have a significant impact on physiology. A quantitative appraisal of all expressed sequences is thus invaluable to studies of quantitative traits such as heterosis (Birchler et al., 2003
With 454-based, long-read sequencing of 3'-UTRs, quantitative profiles for allele-specific inheritance patterns can be generated in the absence of a priori data on polymorphisms. Allelic variants are frequently distinguished by single-feature polymorphisms such as those that marked nearly identical paralogs in this study. Identifying allele-specific differences in gene expression and quantifying parental contributions to complex traits in F1 hybrids are key to understanding genetic mechanisms such as imprinting (Guo et al., 2003
Natural variation can also be assessed with array-based probe sets generated from 3'-anchored sequence reads (Borevitz et al., 2003
By combining the specificity of 3'-UTRs with long-read, high-throughput sequencing, we are able to distinguish expression of newly identified genes and closely related transcripts on a genome-wide scale. This can also be accomplished without reference to a completely sequenced genome. The approach provides an efficient avenue for gene discovery and elucidation of variations in expression that underlie natural variation and contribute to complex genetics of heterosis and imprinting. In addition, 3'-UTR profiling advances studies of comparative and functional genomics by quantitatively resolving expression of gene families and identifying unknown gene family members.
Plant Materials Maize (Zea mays) plants were grown in 14-inch, 7-gallon pots under greenhouse conditions (September to November in Gainesville, FL) at 12-h-light/12-h-dark cycles. Sibling wild-type and vp1 mutant plants in a W22 inbred background were derived from a self-pollinated vp1/+ heterozygous ear. A drought-stress treatment was initiated by gradually withholding water beginning 2 weeks prior to tassel emergence. Soil was covered to restrict water loss by evaporation and pots were weighed at the end of each day to determine water loss to transpiration. Water lost to transpiration was added back. One week after ears first appeared, water was withheld completely. Ears were collected right before silk emergence from wild-type and vp1 mutant plants. Immature ovaries (with pedicels) were hand dissected from equivalent sections of each ear (base-to-mid section), weighed to 50 mg fresh weight (15 ovaries per ear), and frozen in liquid N2.
Tissue was homogenized in TRIzol reagent (Invitrogen) using a FastPrep lysis system (Q-BIOgene). RNA was extracted using standard methods based on protocols from the University of Arizona (www.maizearray.org). Total RNA (5 µg) from wild-type and vp1 mutant ovaries was used for cDNA synthesis (MessageAmp II, Ambion) and primed with 6 pmol biotinylated (T12) B-adaptor (modified from Margulies et al., 2005
Adaptor pairs were combined and concentrated to 1 pmol/µL in salt buffer (10 mM Tris, 1 mM EDTA, 50 mM NaCl [pH 8]) and annealed by incremental, –1 degree/min decreases (95°C–4°C, with a 30-min hold at 72°C–71°C). Adaptors (5 pmol) with multiplex keys CAT and AGT were ligated to digested wild-type and mutant cDNA samples, respectively. The 3-base key sequences enabled detection of single-base errors in the multiplex key. Unligated adaptors were removed by washing beads twice with 1x B & W buffer (2.5 mM Tris-HCl, pH 7.5, 0.25 mM EDTA, 0.5 M NaCl) and twice with distilled, deionized water. The desired 5'-A-cDNA-B-3' template strand was eluted with 100 mM NaOH, neutralized, and concentrated on a Qiagen column (Margulies et al., 2005
The expected yield of approximately 3 × 10^9 template molecules for the combined libraries was confirmed by a SYBR Green Q-PCR strategy (MyiQ, Bio-Rad). Molecules per microliter of amplified product were calculated from an in vitro transcribed (MAXIscript, Ambion)
Quality-trimmed 454 sequences (FASTA format) were filtered for valid key and ligation junction (CGG) sequences at 5' ends, and poly-A tails were trimmed using custom programs written in Java. Validated, trimmed sequences (93% of total reads) were assembled using CAP3 (http://genome.cs.mtu.edu/sas.html). The nonredundant set of consensus cDNA sequences represented by two or more reads (14,822 total assemblies) was annotated by BLASTN searches of cDNA databases for maize. These included ZmGI and IUC, a collection of cDNAs provided by an industry consortium via a user's agreement (http://www.maizeseq.org). Functional classifications of cDNA matches were based on Gene Ontology terms associated with PFam (http://www.sanger.ac.uk/Software/Pfam/) assignments in IUC. In addition, consensus sequences were aligned by BLASTN to MAGI (version 4.0 [http://magi.plantgenomics.iastate.edu/]). All 3'-consensus sequence tags were deposited into dbEST (NCBI).
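The read-validation step can be sketched as follows. The authors used custom Java programs that are not reproduced in the article, so this Python version is only an illustration. It assumes each quality-trimmed read begins with the three-base multiplex key followed directly by the CGG ligation junction, and it uses a hypothetical minimum poly-A run length of eight bases for tail trimming. The key-to-sample assignment (CAT for wild type, AGT for vp1) is taken from the adaptor description above.

```python
import re
from typing import Optional, Tuple

# Key assignment as reported for the two sublibraries.
KEYS = {"CAT": "wild-type", "AGT": "vp1"}
JUNCTION = "CGG"  # MspI ligation junction expected right after the key


def process_read(seq: str, min_poly_a: int = 8) -> Optional[Tuple[str, str]]:
    """Validate key and junction, strip the key, and trim a 3' poly-A tail.

    Returns (sample, trimmed_sequence), or None if the read is invalid.
    min_poly_a is an illustrative threshold, not a value from the article.
    """
    seq = seq.upper()
    key_len = 3
    key = seq[:key_len]
    if key not in KEYS:
        return None
    if seq[key_len:key_len + len(JUNCTION)] != JUNCTION:
        return None
    insert = seq[key_len:]  # keep the junction, which is part of the cDNA site
    insert = re.sub(r"A{%d,}$" % min_poly_a, "", insert)
    return KEYS[key], insert
```

Reads that pass this filter could then be written back to FASTA, with the sample label in the read name, before assembly.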
Real-time RT-PCR was carried out to validate technical replicates of RNA samples used in sublibrary construction. For real-time PCR analysis, cDNA was synthesized from DNaseI-treated (Ambion) total RNA using an oligo(dT) primer (TaqMan Reverse Transcription Reagents, ABI). Real-time PCR was monitored using the MyiQ Single Color Real-Time PCR Detection system (Bio-Rad). Each reaction contained 10 µL of 2x iQ SYBR Green Supermix (Bio-Rad), 1.0 µL of cDNA sample, and 200 nM gene-specific primer in a final volume of 20 µL. All reactions were performed in triplicate. The relative abundance of transcripts was normalized with 18S rRNA control values using TaqMan (Ribosomal RNA Control Reagents, ABI) and to the constitutive expression of a CBS domain chloride channel ([2562879] CBS forward, 5'-ATGGATGCTGCTGTTCTCATGCTC-3' and CBS reverse, 5'-ATGGAGTCTCCTGGCGTGCTAC-3'), thaumatin/osmotin ([1321765] Osmotin forward, 5'-TACCGCAGCAGCTGAACAACG-3' and Osmotin reverse, 5'-ATGTTCCGTCGCAGTCGCTAGG-3'), senescence-associated/tetraspanin ([TC299489] Sa forward, 5'-AACGACGAGGACGACCTCTGC-3' and Sa reverse, 5'-AGTTTGATTAAGCGTCACCGCCTCG-3'), chlorophyll a/b-binding protein ([TC299127] Cab forward, 5'-TGTACCCTGGCGGCAGCTTC-3' and Cab reverse, 5'-ATCCACGTACGTACACCCTCTCC-3'), copper transport ATPase/heavy-metal associated ([2562278] Hma forward, 5'-AGCCAAAGCTGACGCCTGATC-3' and Hma reverse, 5'-TCCTGCAAGGGATGTGTTGTTC-3'), Gly-rich protein ([2923887] Grp forward, 5'-ATCAGGTGAAGGATACGGACAAGGTG-3' and Grp reverse, 5'-ACAGGACAAATTACAAGCCTTGCGGTG-3'), dehydrin DHN1/RAB-17 ([TC286791] dehydrin forward, 5'-ACAGCACTGAGCGGCGCCTATAC-3' and dehydrin reverse, 5'-ACGTAGCAGCATAAACAGTACACGGACC-3').
Relative expression levels of ARDA1 and ARDA2 were compared by real-time RT-PCR using SYBR Green and gene-specific primers within and around the 18-bp indel sequence (Arda1 forward, 5'-TACAAGCGGGCGCAGTCG-3', Arda1 reverse, 5'-AGCAAACATGGCCTCTTCACTG-3'; Arda2 forward, 5'-TACAAGCGGGCGCAGTCG-3', Arda2 reverse 5'-TGGCCTGACAGAGACACCG-3'). Sequence data from this article can be found in the GenBank/EMBL data libraries under accession numbers EY950428 through EY965249.
The following materials are available in the online version of this article.
We thank William Farmerie and Regina Shaw of the University of Florida Interdisciplinary Center for Biotechnology Research sequencing core for assistance with sequencing. We thank Susan P. Latshaw (Department of Horticultural Sciences, University of Florida) for adaptor oligo design and Wayne T. Avigne (Department of Horticultural Sciences, University of Florida) for lab and greenhouse support. Received September 3, 2007; accepted October 26, 2007; published November 16, 2007.
1 This work was supported by the National Science Foundation (grant nos. NSF–PGRP–0217552, NSF–PGRP–0077676, and NSF–SGER–0542665). The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Karen E. Koch (kekoch{at}ufl.edu).
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription. www.plantphysiol.org/cgi/doi/10.1104/pp.107.108597 * Corresponding author; e-mail kekoch{at}ufl.edu.
Appenzeller L, Doblin M, Barreiro R, Wang H, Niu X, Kollipara K, Carrigan L, Tomes D, Chapman M, Dhugga KS (2004) Cellulose synthesis in maize: isolation and expression analysis of the cellulose synthase (CesA) gene family. Cellulose 11: 287–299
Bao JY, Lee S, Chen C, Zhang XQ, Zhang Y, Liu SQ, Clark T, Wang J, Cao ML, Yang HM, et al (2005) Serial analysis of gene expression study of a hybrid rice strain (LYP9) and its parental cultivars. Plant Physiol 138: 1216–1231
Barbazuk WB, Emrich SJ, Chen HD, Li L, Schnable PS (2007) SNP discovery via 454 transcriptome sequencing. Plant J 51: 910–918
Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K (2007) High-resolution profiling of histone methylations in the human genome. Cell 129: 823–837
Baxter CJ, Sabar M, Quick WP, Sweetlove LJ (2005) Comparison of changes in fruit gene expression in tomato introgression lines provides evidence of genome-wide transcriptional changes and reveals links to mapped QTLs and described traits. J Exp Bot 56: 1699–1709
Bhattramakki D, Dolan M, Hanafey M, Wineland R, Vaske D, Register JC III, Tingey SV, Rafalski A (2002) Insertion-deletion polymorphisms in 3' regions of maize genes occur frequently and can be used as highly informative genetic markers. Plant Mol Biol 48: 539–547
Birchler JA, Auger DL, Riddle NC (2003) In search of the molecular basis of heterosis. Plant Cell 15: 2236–2239
Blanc G, Wolfe KH (2004) Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16: 1679–1691
Borevitz JO, Chory J (2004) Genomics tools for QTL analysis and gene discovery. Curr Opin Plant Biol 7: 132–136
Borevitz JO, Liang D, Plouffe D, Chang HS, Zhu T, Weigel D, Berry CC, Winzeler E, Chory J (2003) Large-scale identification of single-feature polymorphisms in complex genomes. Genome Res 13: 513–523
Bray EA, Shih TY, Moses MS, Cohen A, Imai R, Plant AL (1999) Water-deficit induction of a tomato H1 histone requires abscisic acid. Plant Growth Regul 29: 35–46
Cao X, Costa LM, Biderre-Petit C, Kbhaya B, Dey N, Perez P, McCarty DR, Gutierrez-Marcos JF, Becraft PW (2007) Abscisic acid and stress signals induce Viviparous-1 (Vp1) expression in seed and vegetative tissues of maize. Plant Physiol 143: 720–731
Chen J, Sun M, Lee S, Zhou G, Rowley JD, Wang SM (2002) Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proc Natl Acad Sci USA 99: 12257–12262
Cheung VG, Spielman RS, Ewens KG, Weber TM, Morley M, Burdick JT (2005) Mapping determinants of human gene expression by regional and genome-wide association. Nature 437: 1365–1369
Cong B, Liu J, Tanksley SD (2002) Natural alleles at a tomato fruit size quantitative trait locus differ by heterochronic regulatory mutations. Proc Natl Acad Sci USA 99: 13606–13611
Cowles CR, Hirschhorn JN, Altshuler D, Lander ES (2002) Detection of regulatory variation in mouse genes. Nat Genet 32: 432–437
Emrich SJ, Barbazuk B, Schnable PS (2007a) Gene discovery and annotation using LCM-454 transcriptome sequencing. Genome Res 17: 69–73
Emrich SJ, Li L, Wen TJ, Yandeau-Nelson MD, Fu Y, Guo L, Chou HH, Aluru S, Ashlock DA, Schnable PS (2007b) Nearly identical paralogs: implications for maize (Zea mays L.) genome evolution. Genetics 175: 429–439
Fu Y, Emrich SJ, Guo L, Wen TJ, Ashlock DA, Aluru S, Schnable PS (2005) Quality assessment of maize assembled genomic islands (MAGIs) and large-scale experimental verification of predicted genes. Proc Natl Acad Sci USA 102: 12282–12287
Furusawa C, Kaneko K (2003) Zipf's law in gene expression. Phys Rev Lett 90: 088102
Gu Z, Rifkin SA, White KP, Li WH (2004) Duplicate genes increase gene expression diversity within and between species. Nat Genet 36: 577–578
Guo M, Rupe MA, Danilevskaya ON, Yang X, Hu Z (2003) Genome-wide mRNA profiling reveals heterochronic allelic variation and a new imprinted gene in hybrid maize endosperm. Plant J 36: 30–44
Guo M, Rupe MA, Zinselmeier C, Habben J, Bowen BA, Smith OS (2004) Allelic variation of gene expression in maize hybrids. Plant Cell 16: 1707–1716
Harkins T, Jarvie T (2007) Metagenomics analysis using the genome sequencer FLX system. Nat Methods 4: application notes iii–v
Helentjaris T, Weber D, Wright S (1988) Identification of the genomic locations of duplicate nucleotide sequences in maize by analysis of restriction fragment length polymorphisms. Genetics 118: 353–363
Holland N, Holland D, Helentjaris T, Dhugga KS, Xoconostle-Cazares B, Delmer DP (2000) A comparative analysis of the plant cellulose synthase (CesA) gene family. Plant Physiol 123: 1313–1323
Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9: 868–877
Jongeneel CV, Delorenzi M, Iseli C, Zhou D, Haudenschild CD, Khrebtukova I, Kuznetsov D, Stevenson BJ, Strausberg RL, Simpson AJG, et al (2005) An atlas of human gene expression from massively parallel signature sequences (MPSS). Genome Res 15: 1007–1014
Kuznetsov VA, Knott GD, Bonner RF (2002) General statistics of stochastic process of gene expression in eukaryotic cells. Genetics 161: 1321–1332
Lu C, Tej SS, Luo S, Haudenschild CD, Meyers BC, Green PJ (2005) Elucidation of the small RNA component of the transcriptome. Science 309: 1567–1569
Ma J, Morrow DJ, Fernandes J, Walbot V (2006) Comparative profiling of sense and antisense transcriptome of maize lines. Genome Biol 7: R22
Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–380