First published online December 12, 2008; 10.1104/pp.108.129874 Plant Physiology 149:961-980 (2009) © 2009 American Society of Plant Biologists OPEN ACCESS ARTICLE
Mapping Metabolic and Transcript Temporal Switches during Germination in Rice Highlights Specific Transcription Factors and the Role of RNA Instability in the Germination Process
Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley, Western Australia 6009, Australia (K.A.H., R.N., A.C., A.I., A.H.M., J.W.); and Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany (M.L., B.U.)
Transcriptome and metabolite profiling of rice (Oryza sativa) embryo tissue during a detailed time course formed a foundation for examining transcriptional and posttranscriptional processes during germination. One hour after imbibition (HAI), independent of changes in transcript levels, rapid changes in metabolism occurred, including increases in hexose phosphates, tricarboxylic acid cycle intermediates, and γ-aminobutyric acid. Later changes in the metabolome, including those involved in carbohydrate, amino acid, and cell wall metabolism, appeared to be driven by increases in transcript levels, given that the large group (over 6,000 transcripts) observed to increase from 12 HAI were enriched in metabolic functional categories. Analysis of transcripts encoding proteins located in the organelles of primary metabolism revealed that for the mitochondrial gene set, a greater proportion of transcripts peaked early, at 1 or 3 HAI, compared with the plastid set, and notably, many of these transcripts encoded proteins involved in transport functions. One group of over 2,000 transcripts displayed a unique expression pattern beginning with low levels in dry seeds, followed by a peak in expression levels at 1 or 3 HAI, before markedly declining at later time points. This group was enriched in transcription factors and signal transduction components. A subset of these transiently expressed transcription factors were further interrogated across publicly available rice array data, indicating that some were only expressed during the germination process. Analysis of the 1-kb upstream regions of transcripts displaying similar changes in abundance identified a variety of common sequence motifs, potential binding sites for transcription factors. Additionally, newly synthesized transcripts peaking at 3 HAI displayed a significant enrichment of sequence elements in the 3' untranslated region that have been previously associated with RNA instability.
Overall, these analyses reveal that during rice germination, an immediate change in some metabolite levels is followed by a two-step, large-scale rearrangement of the transcriptome that is mediated by RNA synthesis and degradation and is accompanied by later changes in metabolite levels.
Germination is a series of events that begins with imbibition, the uptake of water by the dry seed, followed by reinitiation of metabolic processes, elongation of the embryonic axis, and, by strict definition, terminates when part of the embryo emerges from the structures that surround it (Bewley, 1997
Historically, regulation of germination has been described by the antagonistic interaction of the phytohormones abscisic acid (ABA) and GA, whereby ABA represses germination and GA promotes germination (Bewley, 1997
Although seed development and germination have been studied for several decades, recent advances in our understanding of these complex processes have largely resulted from the expansion of available sequence data and the establishment of large-scale -omics technologies. In particular, for the dicot model, Arabidopsis, a number of studies utilizing transcriptomic, proteomic, and metabolomic methods to investigate seed maturation, dormancy, and germination have been published (Nakabayashi et al., 2005
In comparison, there is a relative paucity of similar studies in monocots, particularly at the whole genome level, with respect to transcriptomic and metabolomic studies. While some transcriptome studies in wheat (Triticum aestivum) and barley have been performed (Watson and Henry, 2005
Rice is an important food crop and is the first crop to have its genome sequenced, making it the model of choice for grass species. Several conditions established rice as the optimal choice for global germination analysis in monocots: (1) the availability of whole genome sequence information; (2) an established growth system for studying germination (Howell et al., 2006
Transcriptome and Metabolite Profiling of Early Stages of Rice Germination
We have previously characterized changes in water content and metabolic activity in rice embryos during germination up to 48 HAI and have observed the expected triphasic mode of water uptake with concomitant increases in oxygen uptake (Howell et al., 2006
Metabolite analysis was performed on the same samples used for microarray analysis and additional samples collected 6 and 48 HAI. A total of 126 unique metabolites were detected in the rice embryo samples, and of these, 66 could be identified based on matching to previously run standards (Supplemental Table S2A). Statistical analysis of metabolite abundance revealed that most (93%) of the 126 metabolites detected showed significant (P < 0.05) changes in abundance between at least two time points sampled during the time course, and of the 66 metabolites identified, all were found to show significant changes in abundance (Fig. 1; Supplemental Table S2B). Although a number of significant changes in metabolite abundance were observed just 1 HAI (25 of the 126 metabolites displayed significant changes between 0 and 1 HAI), the differences between 1, 3, and 6 HAI were more subtle compared with the large changes in the transcriptome observed from 1 to 3 HAI (Fig. 1; Supplemental Table S2B). In contrast, large overall changes in metabolite profiles were observed from 12 HAI onward (Fig. 1), with more than 50 of the metabolites displaying a significant change in abundance at 12 HAI or later (Fig. 1; Supplemental Table S2A). Overall, examination of the changes in metabolites and transcripts revealed a rapid change in metabolite levels within 1 HAI, which preceded the large changes in transcript abundance at 3 and 12 HAI and was followed by further changes in metabolites at 12 HAI or later.
The striking changes in metabolite levels that occurred just 1 HAI were predominantly associated with major carbohydrate metabolism (Fig. 2A; Supplemental Table S2B). Fru-6-P, Glc-6-P, and glycerate-3-phosphate increased 7- to 42-fold between 0 and 1 HAI and were also observed to increase at all time points thereafter. Other metabolites found to rapidly increase included the tricarboxylic acid (TCA) cycle intermediates 2-oxoglutarate, aconitate, fumarate, malate, and succinate, with increases ranging from 2.7- to over 16-fold. This suggests that there is an immediate increase in the activity of glycolysis and the TCA cycle that facilitates early, energy-demanding processes. While most of the changes in amino acids were seen to occur later in the time course,
To compare these metabolite patterns with profiles observed for the 24,150 transcripts detected during rice germination, transcript abundance data were normalized to the highest value for each transcript and then hierarchically clustered, resulting in four main types of transcript profile patterns (Fig. 2B). Cluster 1 represents just under one-third of all expressed genes and is characterized by transcripts that have relatively low and stable levels at early stages of germination and then increase over the time course examined. Cluster 1 was subdivided into four subgroups (A–D) based on when the increase in transcript abundance was observed: cluster 1A increases from 12 to 24 HAI; cluster 1B increases from 3 to 12 HAI, followed by decreases from 12 to 24 HAI; cluster 1C increases between 3 and 12 HAI and remains high at 24 HAI; and cluster 1D increases after 1 or 3 HAI (Fig. 2B). Cluster 2 (black) was unique in that the transcript abundance profiles peaked in abundance after 1 or 3 HAI and then decreased to low levels again from 12 HAI (Fig. 2B), suggesting that a distinct regulatory process has occurred that transiently affects the transcript abundance of over 2,000 genes. Furthermore, the transient but dramatic increases in the transcripts that constitute cluster 2 precede the majority of the increases in transcripts observed for cluster 1, which occur at 12 HAI or later (Fig. 2B). Cluster 3 (pink) is defined by transcripts that decrease throughout the time course of the study, representing almost one-third of all expressed genes, and can be divided into two subgroups: 3A, in which the profiles showed a general decrease in transcript abundance, starting after 1 HAI and continuing to 12 HAI; and 3B, in which the decrease was not as dramatic as that observed for cluster 3A. Cluster 4 (blue) comprises just over one-quarter of all expressed genes and showed relatively constant transcript levels across the time course (Fig. 2B). 
Interestingly, cluster 1C and cluster 3A are practically mirror images, in that they both include around 3,500 genes, and while cluster 1C shows an increase at 3 HAI, cluster 3A displays a corresponding decrease.
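The normalization step that precedes clustering (each transcript scaled to its own highest value) can be illustrated with a minimal Python sketch. The function name and the abundance values are hypothetical and are not taken from the study; the clustering itself was performed in dedicated software and is not reproduced here.

```python
def normalize_to_max(profile):
    """Scale one transcript's abundance values so that its highest value is 1.0."""
    peak = max(profile)
    if peak <= 0:
        raise ValueError("profile needs at least one positive value")
    return [value / peak for value in profile]


# Hypothetical abundances at 0, 1, 3, 12, and 24 HAI (illustrative numbers only).
example = [50.0, 400.0, 800.0, 200.0, 100.0]
print(normalize_to_max(example))
```

After this scaling, profiles with very different absolute abundances become comparable in shape, which is what allows transient, increasing, and decreasing patterns to be grouped together.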
To understand the significance of these distinct patterns of transcript abundance and their relationship to the metabolome changes, three types of analysis were conducted that each provided a different insight into a molecular understanding of the germination process in rice. The first analysis was performed using the PageMan (Usadel et al., 2006
The above analyses reveal the characteristics of statistically significant changes between successive time points. Thus, they primarily give insights into the changes that are occurring in clusters 1 and 3, where large fold changes of many transcripts are occurring. They are not informative for sets of genes that do not change (i.e. cluster 4) and may also miss some changes that occur in clusters with smaller numbers of genes (i.e. cluster 2). Therefore, a second analysis approach was carried out on changes in transcripts based on all transcript profiles (i.e. the 24,150-gene set; Fig. 2B) and the functional categories of the encoded proteins. Differences were determined by calculating z-scores to test if the percentage of a particular category was significantly higher or lower (P < 0.01) than in the whole genome (Fig. 5; Supplemental Fig. S4; Supplemental Table S3). Cluster 2, characterized by transient increases in abundance at early stages of germination (1 and 3 HAI), was found to contain a significantly higher proportion of transcripts encoding transcription factors and proteins involved in signal transduction and was underrepresented in several categories of metabolism (Fig. 5). Cluster 4, which displayed relatively constant profiles over the 24-h time period, was found to have a higher proportion of transcripts associated with translation as well as protein folding, sorting, and degradation, which suggests a consistent requirement of the proteins involved in these functions (Fig. 5; Supplemental Fig. S4; Supplemental Table S3). Furthermore, for transcripts comprising clusters 1 and 3, the findings of the PageMan/MapMan analysis were supported by this type of approach.
Third, sequential changes in metabolic organelle function (plastids, mitochondria, and peroxisomes) were investigated during germination. In order to determine if transcripts that encode organelle proteins changed in a coordinated manner compared with that observed for all transcripts (Fig. 2B), subsets of transcripts for genes encoding organelle proteins were reanalyzed by clustering analysis. Four clusters could be clearly defined for transcripts that encoded proteins located in mitochondria, plastids, and peroxisomes based on similar temporal changes in transcript abundances (Fig. 6A; Supplemental Fig. 5) compared with what was observed when the abundances of all transcripts of the array defined as present were clustered using identical parameters. Functional categorization analysis of the organelle cluster sets was used to determine which categories were overrepresented or underrepresented in each cluster (Fig. 6B). For both mitochondrial and plastid sets, transcripts encoding proteins involved in energy were found to be overrepresented in cluster 1 and underrepresented in cluster 3 (mitochondrial) and cluster 4 (plastid). This enrichment of energy functions in cluster 1 correlates with the requirement for large amounts of energy in the early stages of germination. For the plastid set, transcripts associated with protein synthesis were also overrepresented in cluster 1, while transcripts associated with protein fate were underrepresented (Fig. 6B). This may correspond with the order of processes that occur in organelles that have their own genome. Previous studies investigating mitochondrial biogenesis during rice germination revealed that the transcripts encoding import components (protein fate) appear first, followed by the other organelle-localized proteins, which can only enter the organelle via these import components (Howell et al., 2006
We have previously suggested a sequential assembly of mitochondria during germination based on the examination of a limited number of genes (Howell et al., 2006 subunit of the pyruvate dehydrogenase complex and cytochrome c), one isoform increased over the time period examined while another decreased (Supplemental Table S5B), suggesting that there may be a switch in the isoform utilized during seed maturation versus germination processes. In combination, these three analysis approaches revealed an almost immediate change in the metabolome, followed by a two-step large-scale rearrangement of the transcriptome featuring metabolic organelle biogenesis and followed by increases in amino acids and components involved in cell wall and carbohydrate metabolism. However, this analysis does not explain what the switch or driver was for these phases in the germination process.
The above analysis of overrepresented and underrepresented functional categories revealed that transcription factors are underrepresented in clusters 1 and 4 (i.e. transcript levels that increase or remain stable) but are overrepresented in clusters 2 and 3 (i.e. transcript levels that increase only transiently or decrease; Fig. 5; Supplemental Table S3). Given that the transcription factors in cluster 3 are characterized as having their highest levels in the dry seeds before decreasing over the germination period, these putatively represent regulators involved in processes associated with seed maturation and desiccation. These transcripts appear to be stored in the dry seeds and then decay as the germination process proceeds. However, the transcription factors contained in cluster 2 are at low levels in the dry seeds and are only transiently expressed at 1 or 3 HAI and, thus, may represent an important regulatory switch that may then drive the changes in transcript abundance that occur later, particularly with respect to increases in transcript abundance represented in cluster 1.
Given these interesting observations, we performed further analysis on the rice transcription factor set. A comprehensive list of rice transcription factors was collated from various databases and studies (as described in "Materials and Methods"), and it was found that transcripts for 1,786 of these were detected in at least one time point of this study. Their transcript profiles were analyzed by hierarchical clustering (Fig. 7A), and for each cluster type, the proportions of the different transcription factor families were analyzed (Fig. 7B; Supplemental Table S4B). Interestingly, it was found that there was a bias in the types of transcription factors occurring in each cluster. Cluster 1 had a higher proportion of AUX/IAA and basic helix-loop-helix families, while cluster 4 was enriched in the SET family transcription factors (Fig. 7B). These findings are consistent with the observation that members of the AUX/IAA family have previously been associated with GA and auxin signaling pathways during germination in barley (Sreenivasulu et al., 2008
In contrast, cluster 2, characterized by a transient peak in expression at 3 HAI before decreasing, was found to be enriched in AP2-EREBP and WRKY family members (Fig. 7B). AP2 family members are known to play an important role in ABA signaling and in water uptake/drought response, with mutants of an AP2-EREBP family member in Arabidopsis showing increased water loss (Song et al., 2005
Transcription factors identified as belonging to cluster 2 (Fig. 7A) may mediate changes in transcript levels observed later in the time course (i.e. increases in the transcript abundance observed for over 6,000 genes in cluster 1). To determine if these were specific to the process of germination, we analyzed the expression of transcription factors across publicly available rice Affymetrix microarray data. These included analyses of over 30 microarrays from different tissues and stress treatments, and following normalization, all data were made relative to maximum expression across all arrays (detailed in "Materials and Methods"). The 117 transcription factors comprising cluster 2A were then examined closely across the compiled normalized data from this study and the public data, and 34 transcription factors were identified that reached at least 70% of maximum expression levels at 1 or 3 HAI when all available rice array data were analyzed (Fig. 8). Interestingly, nine of these were found to be exclusively expressed at these early time points during germination and were absent or at very low levels across all the other arrays analyzed (Fig. 8A, yellow boxes). It is important to point out that three of these belonged to the AP2-EREBP family, which further supports our conclusion on the importance of this family in regulating water uptake and ABA signaling specifically during germination. Furthermore, it was interesting that of the 1,786 transcription factors, only two belonged to the ABI3/VP1-2 family, and both of these fell into the group of nine genes expressed, almost uniquely, during germination. ABI3/VP1 family members are known to play a role as intermediaries in regulating ABA-responsive genes (Lazarova et al., 2002
By searching for Arabidopsis homologs of the "germination-specific" transcription factors identified in this study and verifying their expression profiles using the eFP browser (Winter et al., 2007
Over 17,000 transcripts were observed in dry seeds, and over the germination time course, more than 18,000 of the 24,150 transcripts present in total were found to significantly change in abundance. A number of peaks in transcript abundance were observed, at 1 and 3 HAI (cluster 2) and 12 HAI (cluster 1B), while some transcripts present in dry seeds were observed to decrease (cluster 3; Fig. 2B). These changes occurred within a 24-h period, suggesting several regulatory steps. In order to uncover the regulatory processes that caused these changes, searches for the presence of common sequence elements in the promoter regions or 3' untranslated regions (UTRs) were carried out. As outlined in "Materials and Methods," 10 sets of genes varying in number from five to 90 were examined for sequence elements (Supplemental Table S6A). Each of the 10 sets comprised genes that peaked at one time point, where a peak was defined as having a transcript abundance of 1.0 (100%) at the peak time point (0, 1, 3, 12, or 24 HAI) with less than 50% transcript abundance at all other times examined. Groups examined included the mitochondrial (3 and 24 HAI), plastid (3, 12, and 24 HAI), and transcription factors (0, 1, 3, 12, and 24 HAI) sets (Supplemental Table S6, B and C). The mitochondrial and plastid sets did not have any transcripts that "peaked" at 0 and 1 HAI (and 12 HAI for the mitochondrial set). Searches identified a number of conserved elements in each group (Supplemental Table S6B). The transcription factor set that peaked in expression at 3 HAI contained two elements that occurred in all 51 genes. As might be expected, there was also some overlap between the elements that occurred in the different groups that peaked at the same time, and these are reflected in the color of the elements that contain a common core sequence (Supplemental Table S6C).
For example, for the transcripts peaking at 3 HAI (in the plastid and transcription factors sets), the corresponding genes were found to contain the helix-turn-helix and BBr/BPC/ARF elements. Five distinct core sequence elements were found to occur within the different groups (above), indicated by color (two related elements in purple), with variations or reverse complements shown (Supplemental Table S6C). Elements that occurred in 70% or more of the sequences from the sets above (Supplemental Table S6, B and C) were taken and searched in the larger genome sets according to expression criteria (i.e. peak expression at one time point and less than 50% at all other time points; Table I). Sequence elements in the 1-kb promoter region were found to be significantly enriched at all time points except 0 HAI (Table I). Transcripts that peaked at 24 HAI contained six elements that were significantly underrepresented and six that were overrepresented, and of these, three were unique to this time point (Table I; Supplemental Table S6D). Transcripts that peaked at 3 HAI had seven elements overrepresented, the greatest number of elements overrepresented in any group, and one element underrepresented (Supplemental Table S6D). Interestingly, two of the elements overrepresented at 3 HAI were underrepresented at 24 HAI.
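The peak criterion used to select these gene sets (a max-normalized abundance of 1.0 at one time point and less than 0.5 at every other time point) can be expressed as a short Python sketch. The function name, the `TIME_POINTS` constant, and the example profiles are hypothetical illustrations, not code or data from the study.

```python
TIME_POINTS = [0, 1, 3, 12, 24]  # HAI sampled for the arrays


def peak_time(normalized, threshold=0.5):
    """Return the HAI at which a max-normalized profile peaks, or None.

    A peak requires a value of 1.0 at one time point and values below
    `threshold` at all other time points.
    """
    for i, value in enumerate(normalized):
        if value == 1.0:
            others = normalized[:i] + normalized[i + 1:]
            if all(v < threshold for v in others):
                return TIME_POINTS[i]
            return None
    return None


print(peak_time([0.1, 0.3, 1.0, 0.4, 0.2]))  # peaks at 3 HAI
print(peak_time([0.1, 0.6, 1.0, 0.4, 0.2]))  # 0.6 at 1 HAI breaks the criterion
```

Because the criterion is strict, transcripts with broad expression (for example, high at both 1 and 3 HAI) are excluded, which keeps each peaking set small enough for motif-detection programs.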
When analyzing changes in transcript abundance, it is important to consider the role of mRNA degradation, particularly when it is evident that dramatic decreases in transcript abundance are occurring for large groups of transcripts after they peak in expression. In order to systematically investigate the role of mRNA decay during germination, 3' UTRs were examined for enrichment of motifs in transcript subsets that showed peak expression at 3, 12, and 24 HAI. The presence of these predicted motifs (Supplemental Table S6C) and of 12 known RNA stability/instability-associated motifs was compared between the subsets and the "whole genome" set; however, this was somewhat restricted due to the fact that only 3,027 genes have an annotated 3' UTR in rice (Supplemental Table S6D). Nevertheless, a clear picture emerged, in that four elements were only significantly enriched in 3 HAI, two of which have been associated previously with RNA instability in Arabidopsis (Narsai et al., 2007
This study provides a comprehensive profile of the transcriptome and metabolites during germination in the monocot model rice. A series of temporal switches in metabolites and transcripts is suggested that results in a reactivation of cellular metabolism to support growth. At the earliest time point analyzed in this study, 1 HAI, there was a greater proportion of the detected metabolites than the detected transcripts changing in abundance relative to the total number of changes observed throughout the time course of this study. These early responses were then followed by the largest change in transcript abundances between 3 and 12 HAI, followed by relatively small changes in transcripts at subsequent time points. In contrast, changes in a large number of metabolites continued up to 48 HAI. This suggests that the early changes in metabolites arise from the activity of preexisting enzymes, as this occurs rapidly, possibly even before the energy-demanding process of translation has been fully activated to synthesize new proteins. However, the later changes in metabolites are more likely driven by transcription and translation, as they occur subsequent to changes in transcript abundance. Furthermore, the changes in transcript abundance that appear transitory in nature, defined in cluster 2, which are enriched in transcription factors but underrepresented in transcripts that encode proteins involved in metabolism, may represent a transition from the dormant state to an active growth state. The peak in transcripts in cluster 2 precedes the increase in abundance of approximately 8,000 transcripts (Fig. 2, cluster 1) but occurs after the decrease in abundance for approximately 8,000 transcripts (Fig. 2, cluster 3). Similar transient peaks in transcript profiles were also observed in a study of germination in Arabidopsis (Nakabayashi et al., 2005
A comparison of our rice data with barley seed germination also reveals similarities, with transcripts encoding components involved in sugar, starch, and lipid metabolism being up-regulated, followed by those involved in photorespiration and photosynthesis (Sreenivasulu et al., 2008
Upon imbibition, there is an immediate increase in hexose sugars and organic acids that is already statistically significant at 1 HAI (Fig. 2A). It has been previously shown that upon imbibition of rice seeds, there is an immediate increase in water uptake and oxygen consumption in the 1st h, and protein uptake into isolated mitochondria can occur within 30 min of imbibition (Howell et al., 2006
A number of analyses of transcription factors revealed similarities with previous studies and give insights into the regulatory processes that occur during germination. Transcription factors preferentially expressed in the germinating embryo of barley, such as ARF, AUX/IAA, C2C2-GATA, and C3H-ARFs, were also observed here in rice. This study reveals a greater resolution of these events. Thus, for cluster 3, enriched in the PHD and HSF transcription factor families, and cluster 4, enriched in SET, it can be seen that these transcription factors are present in dry seeds and decrease or remain largely unchanged, respectively (Fig. 7, A and B). In contrast, the transient cluster 2 is enriched in AP2-EREBP and WRKY, while cluster 1 is enriched in the AUX/IAA family and basic helix-loop-helix. Therefore, despite all clusters containing members from several transcription factor families, there is a clear and significant difference in the proportion of families in each cluster, implying an important time-specific regulatory requirement for the expression of these transcription factors. Examination of the transcription factors that peak in expression at 3 HAI (Supplemental Table S4A) reveals that seven of these are AP2-EREBP transcription factors and two are C2H2 zinc finger transcription factors. Previous studies in Arabidopsis have characterized a role for members of the AP2-EREBP family in the regulation of water uptake (Song et al., 2005
The transcription factors in common with germination and anoxia profiles may also be significant, given that germinating seeds are thought to suffer from oxygen deficit (Bewley, 1997
Approximately 17,000 transcripts are stored in the dry rice embryo during seed development and maturation, compared with approximately 12,000 stored in both barley and Arabidopsis seeds (Nakabayashi et al., 2005 The combination of a specifically timed up-regulation of a suite of specific transcription factors and the degradation of both stored and early-induced mRNAs based on 3' UTR sequences appear to be key elements in the coordination of at least some groups of transcripts during the early events in rice germination. These events appear to operate in a coordinated fashion with the induction of primary metabolic pathways, the biogenesis of organelles, and the establishment of the full metabolic profile in the germinating rice embryo.
Rice Growth
Dehulled, sterilized rice seeds (Oryza sativa Amaroo) were grown under aerobic conditions in the dark at 30°C as described previously (Howell et al., 2006
Total RNA was isolated from rice embryos as described previously (Howell et al., 2006
Transcriptomic analysis was performed using Affymetrix GeneChip Rice Genome Arrays (Affymetrix), and three biological replicates were analyzed for each time point. RNA quality was verified using an Agilent Bioanalyzer (Agilent Technologies) and spectrophotometric analysis (NanoDrop ND-1000; NanoDrop Technologies) to determine concentration and the A260-A280 and A260-A230 ratios. Preparation of labeled copy RNA from 2 to 3 µg of total RNA, target hybridization, as well as washing, staining, and scanning of the arrays were carried out exactly as described in the Affymetrix GeneChip Expression Analysis Technical Manual, using the Affymetrix One-Cycle Target Labeling and Control Reagents, an Affymetrix GeneChip Hybridization Oven 640, an Affymetrix Fluidics Station 450, and an Affymetrix GeneChip Scanner 3000 7G at the appropriate steps. Data quality was assessed using GCOS 1.4 (Affymetrix) before CEL files were imported into Avadis 4.3 (Strand Genomics) for further analysis. Raw intensity data were initially normalized using the MAS5 algorithm so that present calls could be determined for each probe set. Only those probe sets that were called present in at least two out of three replicates in at least one time point were included for further analysis. Ambiguous probe sets and bacterial controls were also removed, resulting in a final 24,150-gene data set. All microarray data have been deposited in the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) under the accession code E-MEXP-1766.
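The present-call filter described above (present in at least two of three replicates in at least one time point) can be sketched in Python as follows. This is only an illustration: the function name, the dictionary layout, and the single-letter call codes ("P" present, "M" marginal, "A" absent) are assumptions and do not describe how Avadis stores or filters its data.

```python
def passes_present_filter(calls_by_time, min_present=2):
    """Return True if any time point has at least `min_present` present calls.

    `calls_by_time` maps a time point (HAI) to a list of per-replicate
    detection calls, e.g. {0: ["A", "A", "M"], 3: ["P", "P", "A"]}.
    """
    return any(
        sum(call == "P" for call in calls) >= min_present
        for calls in calls_by_time.values()
    )


# Hypothetical probe set: absent in dry seed, present in two replicates at 3 HAI.
probe_calls = {0: ["A", "A", "M"], 1: ["A", "P", "A"], 3: ["P", "P", "A"]}
print(passes_present_filter(probe_calls))
```

A probe set with only one present call at every time point would be excluded, which guards against retaining transcripts detected in a single replicate by chance.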
Using the 24,150-gene set, probe intensities were analyzed using the GC-RMA algorithm and log transformed, and differential expression analysis was performed with P value correction (Benjamini and Hochberg, 1995
PageMan (Usadel et al., 2006
The transcription factor list was generated using three main sources: DRTF (Gao et al., 2006 When a transcript was annotated to a particular localization, a "source number" was assigned to represent the source used to determine this localization. The source numbers were representative as follows: 1, localization based on experimental evidence; 2, two of the four primary sources agreed on localization (i.e. cutoffs were met in at least two primary sources); 3, three out of four primary sources agreed on localization; and 4, all four of the primary sources agreed on localization. For some transcripts, there was only information from one primary source; therefore, the cutoffs for some sources were raised to maintain stringency. Thus, transcripts with a source number between 7 and 9 represent transcripts for which there was only information from one of the four primary sources with numbers assigned as follows: 7, these transcripts had >70% identity with the orthologous gene in Arabidopsis with known localization (for peroxisomes, this cutoff was allowed to be lowered to >50%, as the prediction programs and other sources did not provide equivalent coverage for detecting peroxisomal genes); 8, for these transcripts, three of the four GO-related localization sources were annotated to be in the same localization (for peroxisomes, two out of four was sufficient); 9, at least four of the seven predictors agreed on localization. For peroxisomes, only one predictor was sufficient, as most of the prediction programs did not even have peroxisome as a choice of localization; therefore, the PTS1 Predictor default cutoff was deemed to be sufficiently stringent. The source number 10 shows that none of the sources produced any conclusive organelle localization information, even at the lowered standards, while a source number of 11 indicates that one or more of the cutoff criteria were met but the localization based on these methods was conflicting between sources.
For each probe set, the GO annotations and transcript assignment were as retrieved from Affymetrix. The National Science Foundation rice microarray database was used to match each Affymetrix probe identifier to a National Science Foundation accession identifier and to a TIGR locus identifier. These TIGR locus identifiers were then entered into the TIGR rice database, and the putative function of the encoded proteins was derived (Yuan et al., 2005
The z-scores were then matched to the cumulative standard normal table, and the P values were determined.
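Looking up a z-score in the cumulative standard normal table is equivalent to evaluating the normal tail probability directly. The sketch below shows this conversion using only the Python standard library. Whether the analysis used a one-tailed or two-tailed P value is not stated here, so both are given; the function names are illustrative.

```python
import math


def upper_tail_p(z: float) -> float:
    """One-tailed P value: probability that a standard normal exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_sided_p(z: float) -> float:
    """Two-tailed P value: probability of a deviation at least as large as |z|."""
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    for z in (0.0, 1.645, 1.96, 3.0):
        print(f"z = {z:5.3f}  one-tailed P = {upper_tail_p(z):.4f}  "
              f"two-tailed P = {two_sided_p(z):.4f}")
```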
In order to examine transcript abundance changes across different tissues under different conditions and compare these with the germination transcript abundance profiles generated from this study, rice array data were retrieved from the Gene Expression Omnibus within the National Center for Biotechnology Information database. All data were MAS5.0 normalized and then normalized against the average ubiquitin expression for each array. These normalized array data were then compiled, and for each probe set, the maximum expression was set to 1.0 with all other data expressed relative to this. This normalization allowed cross-comparison of arrays from all of the different studies at once. The arrays analyzed included all of the arrays from this study, together with publicly available rice genome arrays carried out from different tissues/conditions, including 7-d-old seedlings that were untreated, drought stressed, salt stressed, or cold stressed (GSE6901; Jain et al., 2007
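The two-step normalization described in this paragraph (scaling each array by its average ubiquitin signal, then setting the maximum of each probe set across all arrays to 1.0) can be sketched as follows. This is a minimal illustration that assumes MAS5.0 values are already available as one dictionary per array; the probe names and function names are hypothetical.

```python
from typing import Dict, List


def normalize_to_ubiquitin(
    array: Dict[str, float], ubiquitin_probes: List[str]
) -> Dict[str, float]:
    """Divide every probe set by the mean ubiquitin signal of the same array."""
    reference = sum(array[p] for p in ubiquitin_probes) / len(ubiquitin_probes)
    return {probe: value / reference for probe, value in array.items()}


def scale_to_probe_maximum(arrays: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Rescale each probe set so its maximum across all arrays equals 1.0."""
    probes = list(arrays[0].keys())
    maxima = {p: max(a[p] for a in arrays) for p in probes}
    return [
        {p: (a[p] / maxima[p] if maxima[p] > 0 else 0.0) for p in probes}
        for a in arrays
    ]


# Two made-up arrays with two ubiquitin probe sets and one gene of interest.
raw = [
    {"ubq1": 100.0, "ubq2": 300.0, "geneA": 50.0},
    {"ubq1": 200.0, "ubq2": 200.0, "geneA": 400.0},
]
normalized = [normalize_to_ubiquitin(a, ["ubq1", "ubq2"]) for a in raw]
scaled = scale_to_probe_maximum(normalized)
```

Because every probe set ends up on a 0 to 1 scale, profiles from arrays produced in different studies can be placed side by side, which is the purpose stated in the text.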
Following expression analysis, distinct groups of transcripts appeared that showed peak expression at single specific time points within the time course. In order to study these coexpressed transcripts more closely, all 1-kb upstream regions of the 24,150 transcripts were retrieved, and these upstream regions were examined for putative cis-acting elements. Programs designed to detect sequence elements generally have limits of less than 80 input sequences; thus, the list was distilled to uncover sequence elements that may be central to the regulatory processes that cause the observed changes in the transcriptome. A "peak" was defined as a probe set having an expression value of 1.0 at that specific time point with expression levels of less than 0.5 at all other time points. Three main cis-element databases were used for this analysis. The first was the Rice Cis-Element Search database (Doi et al., 2008
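The "peak" definition above translates directly into a small filter. The sketch below assumes each probe set is represented by a maximum-scaled profile keyed by time point in HAI; the function name, the tolerance used to test for a value of 1.0, and the example profiles are illustrative.

```python
from typing import Dict, Optional


def peak_time_point(
    profile: Dict[int, float], low: float = 0.5, tol: float = 1e-9
) -> Optional[int]:
    """Return the time point at which the profile peaks, or None.

    A peak requires a value of 1.0 at exactly one time point and values
    below `low` at every other time point.
    """
    top = [t for t, v in profile.items() if abs(v - 1.0) <= tol]
    if len(top) != 1:
        return None
    peak = top[0]
    if all(v < low for t, v in profile.items() if t != peak):
        return peak
    return None


# Made-up profiles over the time points 0, 1, 3, 12, and 24 HAI.
transient = {0: 0.10, 1: 0.45, 3: 1.00, 12: 0.30, 24: 0.20}
broad = {0: 0.10, 1: 0.80, 3: 1.00, 12: 0.60, 24: 0.20}

print(peak_time_point(transient))  # 3
print(peak_time_point(broad))      # None
```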
The full genome 3' UTR and 5' UTR sequences are available from TIGR. These were downloaded and filtered to retain only the 3' UTRs. However, this yielded only 3,027 UTRs for the "whole genome." Given this small number, it was not feasible to examine the organelle-specific and transcription factor peaking subsets analyzed for the promoter regions, as these lists were too small. Thus, for the 3' UTR, the genes peaking in expression at 0, 1, 3, 12, and 24 HAI in the entire genome set were analyzed; however, there were still too few in the 0- and 1-HAI peaking subsets, so these could not be analyzed (Table I). In order to look at the enrichment of motifs in an objective manner, only the MEME Web server was used, as we were not searching for known regulatory elements. MEME was set to search for five motifs of 6 to 8 bp (default) in each of the subsets, and the outputs are shown at the bottom of Supplemental Table S6D. It is important to note that requesting five motifs can result in false present calls for motifs that are not significant when the input list is small; therefore, only the significantly enriched motifs (present in 60%–70% of all input sequences) were included for further analysis (Supplemental Table S6, C and D). In addition to these putative predicted motifs, 12 motifs known to be associated with RNA stability/instability were examined for their presence in the genome (Table I; Supplemental Table S6D). Ten of these were motifs predicted to be associated with stability/instability of mRNA (Narsai et al., 2007
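The presence filter applied to candidate motifs (retaining those found in a large fraction of the input 3' UTRs) can be sketched as a simple count. The example assumes exact matching of a DNA-alphabet motif and uses 60% as the threshold, the lower bound of the range given above; the sequences, the motif, and the function names are made up for illustration and this is not a substitute for the MEME search itself.

```python
from typing import List


def fraction_with_motif(sequences: List[str], motif: str) -> float:
    """Fraction of sequences that contain the motif at least once."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    motif = motif.upper()
    hits = sum(1 for seq in sequences if motif in seq.upper())
    return hits / len(sequences)


def is_widely_present(sequences: List[str], motif: str, threshold: float = 0.6) -> bool:
    """True when the motif occurs in at least `threshold` of the sequences."""
    return fraction_with_motif(sequences, motif) >= threshold


# Made-up 3' UTR fragments and an AU-rich style motif written in DNA letters.
utrs = ["AAATTTAGG", "CCATTTACC", "GGGGGG", "TTATTTA", "ACGT"]
print(fraction_with_motif(utrs, "ATTTA"))  # 0.6
print(is_widely_present(utrs, "ATTTA"))    # True
```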
Data for the 126 nonredundant metabolites were analyzed by two-way differential comparisons to determine fold changes and associated P values, and the number of significantly changing metabolites was also visualized by heat map. The heat map was generated using Partek Genomics Suite software, version 6.3.
Metabolites were extracted and derivatized using a method modified from that of Roessner-Tunali et al. (2003)
Derivatized metabolite samples were analyzed on an Agilent GC/MSD system composed of an Agilent GC 6890N gas chromatograph (Agilent Technologies) fitted with a 7683B Automatic Liquid Sampler (Agilent Technologies) and 5975B Inert MSD quadrupole MS detector (Agilent Technologies). The gas chromatograph was fitted with a 0.25-mm (i.d.), 0.25-µm film thickness, 30-m Varian FactorFour VF-5ms capillary column with 10 m integrated guard column (Varian; product no. CP9013). GC-MS run conditions were essentially as described for GC-quadrupole-MS metabolite profiling on the Golm Metabolome Database Web site (http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/analytic/gmd_meth.html; Kopka et al., 2005
Raw GC-MS data files in the proprietary ChemStation (.D) format were exported to generic NetCDF/AIA (.CDF) format with ChemStation GC/MSD Data Analysis software (Agilent Technologies). The NetCDF files produced were then processed using in-house MetaMiner software (A. Carroll and A.H. Millar, unpublished data) to carry out all peak detection, quantification, library matching, normalization, statistical analysis, and data visualization. Raw data processing in MetaMiner consisted of the following steps: retrieval of all extracted ion chromatograms (EICs), detection and integration of peaks in EICs, calculation of internally calibrated retention indices for all extracted peaks, matching of carefully selected analyte-specific EIC peaks to analytes in a custom mass spectral-retention index (MSRI) library of known and unknown metabolite derivatives (retention index error < 3 retention index units; Wagner et al., 2003
The following materials are available in the online version of this article.
We thank Ian Castleden from the Centre for Computational Systems Biology for help with the multiple localization predictions and sequence matching.
Received September 12, 2008; accepted December 3, 2008; published December 12, 2008.
1 This work was supported by the Australian Research Council Centre of Excellence (grant no. CEO561495) and an Australian Research Council Australian Professorial Fellowship to A.H.M.
2 These authors contributed equally to the article.
3 Present address: Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: James Whelan (seamus{at}cyllene.uwa.edu.au).
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.108.129874
* Corresponding author; e-mail seamus{at}cyllene.uwa.edu.au.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410
Bailey TL, Williams N, Misleh C, Li WW (2006) MEME: discovering and analyzing DNA and protein sequence motifs. Nucleic Acids Res 34: W369–W373
Bassel GW, Fung P, Chow TF, Foong JA, Provart NJ, Cutler SR (2008) Elucidating the germination transcriptional program using small molecules. Plant Physiol 147: 143–155
Benjamini Y, Hochberg Y (1995) Controlling false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc Ser B Methodological 57: 289–300
Bewley JD (1997) Seed germination and dormancy. Plant Cell 9: 1055–1066
Boden M, Hawkins J (2005) Prediction of subcellular localization using sequence-biased recurrent networks. Bioinformatics 21: 2279–2286
Borisjuk L, Macherel D, Benamar A, Wobus U, Rolletschek H (2007) Low oxygen sensing and balancing in plant seeds: a role for nitric oxide. New Phytol 176: 813–823
Caldana C, Scheible WR, Mueller-Roeber B, Ruzicic S (2007) A quantitative RT-PCR platform for high-throughput expression profiling of 2500 rice transcription factors. Plant Methods 3: 7
Carrera E, Holman T, Medhurst A, Peer W, Schmuths H, Footitt S, Theodoulou FL, Holdsworth MJ (2007) Gene expression profiling reveals defined functions of the ATP-binding cassette transporter COMATOSE late in phase II of germination. Plant Physiol 143: 1669–1679
Chen H, Huang N, Sun Z (2006) SubLoc: a server/client suite for protein subcellular location based on SOAP. Bioinformatics 22: 376–377
Conte MG, Gaillard S, Lanau N, Rouard M, Perin C (2008) GreenPhylDB: a database for plant comparative genomics. Nucleic Acids Res 36: D991–D998
Dardick C, Chen J, Richter T, Ouyang S, Ronald P (2007) The rice kinase database: a phylogenomic database for the rice kinome. Plant Physiol 143: 579–586
Doi K, Hosaka A, Nagata T, Satoh K, Suzuki K, Mauleon R, Mendoza MJ, Bruskiewich R, Kikuchi S (2008) Development of a novel data mining tool to find cis-elements in rice gene promoter regions. BMC Plant Biol 8: 20
Dure L, Waters L (1965) Long-lived messenger RNA: evidence from cotton seed germination. Science 147: 410–412
Emanuelsson O, Brunak S, von Heijne G, Nielsen H (2007) Locating proteins in the cell using TargetP, SignalP and related tools. Nat Protocols 2: 953–971
Emanuelsson O, Nielsen H, von Heijne G (1999) ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci 8: 978–984
Fait A, Angelovici R, Less H, Ohad I, Urbanczyk-Wochniak E, Fernie AR, Galili G (2006) Arabidopsis seed development and germination is associated with temporally distinct metabolic switches. Plant Physiol 142: 839–854
Fait A, Fromm H, Walter D, Galili G, Fernie AR (2008) Highway or byway: the metabolic role of the GABA shunt in plants. Trends Plant Sci 13: 14–19
Gao G, Zhong Y, Guo A, Zhu Q, Tang W, Zheng W, Gu X, Wei L, Luo J (2006) DRTF: a database of rice transcription factors. Bioinformatics 22: 1286–1287
Guo J, Wu J, Ji Q, Wang C, Luo L, Yuan Y, Wang Y, Wang J (2008) Genome-wide analysis of heat shock transcription factor families in rice and Arabidopsis. J Genet Genomics 35: 105–118
Hawkins J, Boden M (2006) Detecting and sorting targeting peptides with neural networks and support vector machines. J Bioinform Comput Biol 4: 1–18
Heazlewood JL, Howell KA, Whelan J, Millar AH (2003) Towards an analysis of the rice mitochondrial proteome. Plant Physiol 132: 230–242
Holdsworth MJ, Bentsink L, Soppe WJ (2008a) Molecular networks regulating Arabidopsis seed maturation, after-ripening, dormancy and germination. New Phytol 179: 33–54
Holdsworth MJ, Finch-Savage WE, Grappin P, Job D (2008b) Post-genomics dissection of seed dormancy and germination. Trends Plant Sci 13: 7–13
Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Res 35: W585–W587
Howell KA, Cheng K, Murcha MW, Jenkin LE, Millar AH, Whelan J (2007) Oxygen initiation of respiration and mitochondrial biogenesis in rice. J Biol Chem 282: 15619–15631
Howell KA, Millar AH, Whelan J (2006) Ordered assembly of mitochondria during rice germination begins with pro-mitochondrial structures rich in components of the protein import apparatus. Plant Mol Biol 60: 201–223
Jain M, Nijhawan A, Arora R, Agarwal P, Ray S, Sharma P, Kapoor S, Tyagi AK, Khurana JP (2007) F-box proteins in rice: genome-wide analysis, classification, temporal and spatial gene expression during panicle and seed development, and regulation by light and abiotic stress. Plant Physiol 143: 1467–1483
Jaiswal P, Ni J, Yap I, Ware D, Spooner W, Youens-Clark K, Ren L, Liang C, Zhao W, Ratnapu K, et al (2006) Gramene: a bird's eye view of cereal genomes. Nucleic Acids Res 34: D717–D723
Kleffmann T, von Zychlinski A, Russenberger D, Hirsch-Hoffmann M, Gehrig P, Gruissem W, Baginsky S (2007) Proteome dynamics during plastid differentiation in rice. Plant Physiol 143: 912–923
Kopka J, Schauer N, Krueger S, Birkemeyer C, Usadel B, Bergmuller E, Dormann P, Weckwerth W, Gibon Y, Stitt M, et al (2005) GMD@CSB.DB: the Golm Metabolome Database. Bioinformatics 21: 1635–1638
Lasanthi-Kudahettige R, Magneschi L, Loreti E, Gonzali S, Licausi F, Novi G, Beretta O, Vitulli F, Alpi A, Perata P (2007) Transcript profiling of the anoxic rice coleoptile. Plant Physiol 144: 218–231
Lazarova G, Zeng Y, Kermode AR (2002) Cloning and expression of an ABSCISIC ACID-INSENSITIVE 3 (ABI3) gene homologue of yellow-cedar (Chamaecyparis nootkatensis). J Exp Bot 53: 1219–1221
Li M, Xu W, Yang W, Kong Z, Xue Y (2007) Genome-wide gene expression profiling reveals conserved and novel molecular functions of the stigma in rice. Plant Physiol 144: 1797–1812
Malagnac F, Bartee L, Bender J (2002) An Arabidopsis SET domain protein required for maintenance but not establishment of DNA methylation. EMBO J 21: 6842–6852
Marchler-Bauer A, Anderson JB, Derbyshire MK, DeWeese-Scott C, Gonzales NR, Gwadz M, Hao L, He S, Hurwitz DI, Jackson JD, et al (2007) CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 35: D237–D240
Nakabayashi K, Okamoto M, Koshiba T, Kamiya Y, Nambara E (2005) Genome-wide profiling of stored mRNA in Arabidopsis thaliana seed germination: epigenetic and genetic regulation of transcription in seed. Plant J 41: 697–709
Narsai R, Howell KA, Millar AH, O'Toole N, Small I, Whelan J (2007) Genome-wide analysis of mRNA decay rates and their determinants in Arabidopsis thaliana. Plant Cell 19: 3418–3436
Neuberger G, Maurer-Stroh S, Eisenhaber B, Hartig A, Eisenhaber F (2003) Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences. J Mol Biol 328: 567–579
Newman TC, Ohme-Takagi M, Taylor CB, Green PJ (1993) DST sequences, highly conserved among plant SAUR genes, target reporter transcripts for rapid decay in tobacco. Plant Cell 5: 701–714
Ohme-Takagi M, Taylor CB, Newman TC, Green PJ (1993) The effect of sequences with high AU content on mRNA stability in tobacco. Proc Natl Acad Sci USA 90: 11811–11815
Palmieri L, Picault N, Arrigoni R, Besin E, Palmieri F, Hodges M (2008) Molecular identification of three Arabidopsis thaliana mitochondrial dicarboxylate carrier isoforms: organ distribution, bacterial expression, reconstitution into liposomes and functional characterization. Biochem J 410: 621–629
Prohl C, Pelzer W, Diekert K, Kmita H, Bedekovics T, Kispal G, Lill R (2001) The yeast mitochondrial carrier Leu5p and its human homologue Graves' disease protein are required for accumulation of coenzyme A in the matrix. Mol Cell Biol 21: 1089–1097
Rajjou L, Gallardo K, Debeaujon I, Vandekerckhove J, Job C, Job D (2004) The effect of alpha-amanitin on the Arabidopsis seed proteome highlights the distinct roles of stored and neosynthesized mRNAs during germination. Plant Physiol 134: 1598–1613
Remm M, Storm CE, Sonnhammer EL (2001) Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol 314: 1041–1052
Riano-Pachon DM, Ruzicic S, Dreyer I, Mueller-Roeber B (2007) PlnTFDB: an integrative plant transcription factor database. BMC Bioinformatics 8: 42
Ribot C, Hirsch J, Balzergue S, Tharreau D, Notteghem JL, Lebrun MH, Morel JB (2008) Susceptibility of rice to the blast fungus, Magnaporthe grisea. J Plant Physiol 165: 114–124
Roessner-Tunali U, Hegemann B, Lytovchenko A, Carrari F, Bruedigam C, Granot D, Fernie AR (2003) Metabolic profiling of transgenic tomato plants overexpressing hexokinase reveals that the influence of hexose phosphorylation diminishes during fruit development. Plant Physiol 133: 84–89
Schaffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV, Altschul SF (2001) Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res 29: 2994–3005
Schauer N, Steinhauser D, Strelkov S, Schomburg D, Allison G, Moritz T, Lundgren K, Roessner-Tunali U, Forbes MG, Willmitzer L, et al (2005) GC-MS libraries for the rapid identification of metabolites in complex biological samples. FEBS Lett 579: 1332–1337
Schneider M, Bairoch A, Wu CH, Apweiler R (2005) Plant protein annotation in the UniProt Knowledgebase. Plant Physiol 138: 59–66
Schwacke R, Fischer K, Ketelsen B, Krupinska K, Krause K (2007) Comparative survey of plastid and mitochondrial targeting properties of transcription factors in Arabidopsis and rice. Mol Genet Genomics 277: 631–646
Small I, Peeters N, Legeai F, Lurin C (2004) Predotar: a tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 4: 1581–1590
Song CP, Agarwal M, Ohta M, Guo Y, Halfter U, Wang P, Zhu JK (2005) Role of an Arabidopsis AP2/EREBP-type transcriptional repressor in abscisic acid and drought stress responses. Plant Cell 17: 2384–2396
Sreenivasulu N, Usadel B, Winter A, Radchuk V, Scholz U, Stein N, Weschke W, Strickert M, Close TJ, Stitt M, et al (2008) Barley grain maturation and germination: metabolic pathway and regulatory network commonalities and differences highlighted by new MapMan/PageMan profiling tools. Plant Physiol 146: 1738–1758
Swarbreck D, Wilks C, Lamesch P, Berardini TZ, Garcia-Hernandez M, Foerster H, Li D, Meyer T, Muller R, Ploetz L, et al (2008) The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res 36: D1009–D1014
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4: 41
Thimm O, Blasing O, Gibon Y, Nagel A, Meyer S, Kruger P, Selbig J, Muller LA, Rhee SY, Stitt M (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37: 914–939
Thomas-Chollier M, Sand O, Turatsinze JV, Janky R, Defrance M, Vervisch E, Brohee S, van Helden J (2008) RSAT: regulatory sequence analysis tools. Nucleic Acids Res 36: W119–W127
Usadel B, Nagel A, Steinhauser D, Gibon Y, Blasing OE, Redestig H, Sreenivasulu N, Krall L, Hannah MA, Poree F, et al (2006) PageMan: an interactive ontology tool to generate, display, and annotate overview graphs for profiling experiments. BMC Bioinformatics 7: 535
Usadel B, Nagel A, Thimm O, Redestig H, Blaesing OE, Palacios-Rojas N, Selbig J, Hannemann J, Piques MC, Steinhauser D, et al (2005) Extension of the visualization tool MapMan to allow statistical analysis of arrays, display of corresponding genes, and comparison with known responses. Plant Physiol 138: 1195–1204
Wagner C, Sefkow M, Kopka J (2003) Construction and application of a mass spectral and retention time index database generated from plant GC/EI-TOF-MS metabolite profiles. Phytochemistry 62: 887–900
Walia H, Wilson C, Condamine P, Liu X, Ismail AM, Zeng L, Wanamaker SI, Mandal J, Xu J, Cui X, et al (2005) Comparative transcriptional profiling of two contrasting rice genotypes under salinity stress during the vegetative growth stage. Plant Physiol 139: 822–835
Walia H, Wilson C, Zeng L, Ismail AM, Condamine P, Close TJ (2007) Genome-wide transcriptional analysis of salinity stressed japonica and indica rice genotypes during panicle initiation stage. Plant Mol Biol 63: 609–623
Watson L, Henry RJ (2005) Microarray analysis of gene expression in germinating barley embryos (Hordeum vulgare L.). Funct Integr Genomics 5: 155–162
Wilson ID, Barker GL, Lu C, Coghill JA, Beswick RW, Lenton JR, Edwards KJ (2005) Alteration of the embryo transcriptome of hexaploid winter wheat (Triticum aestivum cv. Mercia) during maturation and germination. Funct Integr Genomics 5: 144–154
Winter D, Vinegar B, Nahal H, Ammar R, Wilson GV, Provart NJ (2007) An "electronic fluorescent pictograph" browser for exploring and analyzing large-scale biological data sets. PLoS One 2: e718
Xiao B, Wilson JR, Gamblin SJ (2003) SET domains and histone methylation. Curr Opin Struct Biol 13: 699–705
Yuan Q, Ouyang S, Wang A, Zhu W, Maiti R, Lin H, Hamilton J, Haas B, Sultana R, Cheung F, et al (2005) The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiol 138: 18–26
Zdobnov EM, Apweiler R (2001) InterProScan: an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17: 847–848