First published online May 8, 2008; 10.1104/pp.107.115535

Plant Physiology 147:1004-1016 (2008)
© 2008 American Society of Plant Biologists

OPEN ACCESS ARTICLE
BIOINFORMATICS

CressExpress: A Tool for Large-Scale Mining of Expression Data from Arabidopsis

Vinodh Srinivasasainagendra, Grier P. Page, Tapan Mehta, Issa Coulibaly and Ann E. Loraine*

Section on Statistical Genetics, Department of Biostatistics, University of Alabama, Birmingham, Alabama 35294 (V.S., G.P.P., T.M., I.C.); and University of North Carolina at Charlotte, Bioinformatics Research Center, North Carolina Research Campus, Kannapolis, North Carolina 28081 (A.E.L.)


    ABSTRACT
CressExpress is a user-friendly, online, coexpression analysis tool for Arabidopsis (Arabidopsis thaliana) microarray expression data that computes patterns of correlated expression between user-entered query genes and the rest of the genes in the genome. Unlike other coexpression tools, CressExpress allows characterization of tissue-specific coexpression networks through user-driven filtering of input data based on sample tissue type. CressExpress also performs pathway-level coexpression analysis on each set of query genes, identifying and ranking genes based on their common connections with two or more query genes. This allows identification of novel candidates for involvement in common processes and functions represented by the query group. Users launch experiments using an easy-to-use Web-based interface and then receive the full complement of results, along with a record of tool settings and parameters, via an e-mail link to the CressExpress Web site. Data sets featured in CressExpress are strictly versioned and include expression data from MAS5, GCRMA, and RMA array processing algorithms. To demonstrate applications for CressExpress, we present coexpression analyses of cellulose synthase genes, indolic glucosinolate biosynthesis, and flowering. We show that subselecting sample types produces a richer network for genes involved in flowering in Arabidopsis. CressExpress provides direct access to expression values via an easy-to-use URL-based Web service, allowing users to determine quickly if their query genes are coexpressed with each other and likely to yield informative pathway-level coexpression results. The tool is available at http://www.cressexpress.org.


Availability of abundant, high-quality data sets from microarray expression experiments has stimulated rapid progress in gene network analysis for a variety of plant and animal species (Stuart et al., 2003; Craigon et al., 2004; Wille et al., 2004; Wei et al., 2006; Zhong and Sternberg, 2006). These data are making it possible to explore correlated expression patterns for the entire genome, as well as answer focused questions regarding specific pathways and processes. By examining correlated expression patterns between genes, investigators can infer new functions for previously uncharacterized genes or identify potential causal relationships between regulators and their targets. Although the details of individual analyses and applications vary, most are based on the idea that correlated expression, or coexpression, implies biologically relevant relationships between gene products.

Many applications of this idea utilize variations of Pearson's correlation coefficient and linear regression to quantify coexpression relationships. Figure 1 presents an example scatter plot that illustrates the idea. Each point on the plot represents data from one array; x and y coordinates represent expression values for genes indicated on the horizontal and vertical axes, respectively. In this case, there is a strong positive relationship between the two genes' expression values; when one gene's expression is high, the other gene's expression is also high. Computing a linear regression between expression values for the two genes quantifies the strength of this relationship. This yields an r² value, equivalent to the square of Pearson's correlation coefficient, and a P value that expresses the probability of obtaining the observed r² value (or larger) assuming a random relationship between the two variables. The regression also yields a slope, which indicates the direction of the coexpression relationship. Larger r² and smaller P values signal higher confidence in the coexpression relationship.
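As a minimal sketch of the regression described above, the following Python function computes the slope, Pearson's r, and r² for two genes from paired expression values (one pair per array). The function name and the expression values are made up for illustration and are not taken from CressExpress; the P value is omitted because computing it requires a t-distribution, which the standard library does not provide.

```python
import math

def regress(x, y):
    """Least-squares regression of y on x.

    Returns (slope, intercept, r, r_squared), where r is Pearson's
    correlation coefficient and r_squared is its square.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 arrays")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("expression values must vary across arrays")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r

# Hypothetical expression values for two genes across six arrays.
gene_x = [7.1, 8.4, 9.0, 10.2, 11.5, 12.1]
gene_y = [6.8, 8.0, 9.3, 9.9, 11.8, 12.4]

slope, intercept, r, r2 = regress(gene_x, gene_y)
print(f"slope={slope:.3f}  r={r:.3f}  r2={r2:.3f}")
```

The sign of the slope (equivalently, the sign of r) gives the direction of the relationship, while r² gives its strength regardless of direction.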


Figure 1. Two highly coexpressed genes from Arabidopsis. RMA-normalized expression values for AtST5a (AT1G74100; x axis) and CYP83B1 (y axis) are plotted relative to each other. Each point indicates expression values from a single array. Regressing expression values for CYP83B1 (y axis) against expression values for AtST5a (x axis) yields an r² value of approximately 0.7, equivalent to a Pearson's correlation coefficient of approximately 0.85. The regression line appears as a dashed line on the plot.

 
Altogether, these numbers summarize how closely two genes are coexpressed across multiple array experiments and provide a way for experimenters to identify and quantify relationships between genes. When these numbers are available for a large number of genes, they can be used to build coexpression graphs, or networks, in which highly correlated genes (nodes) are linked and less well-correlated genes are not. Studies analyzing coexpression networks have demonstrated that genes connected in the coexpression network often perform related functions, thus demonstrating biological relevance of the approach (for review, see Aoki et al., 2007; Saito et al., 2008).
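The network construction described above can be sketched as follows: compute r² for every pair of genes and link the pairs that meet a cutoff. This is an illustrative sketch only; the gene labels, expression values, and function names are hypothetical, and the cutoff of 0.64 is simply one value used elsewhere in this article.

```python
from itertools import combinations

def r_squared(x, y):
    """Square of Pearson's correlation coefficient for two series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return (sxy * sxy) / (sxx * syy)

def build_network(expr, r2_cutoff):
    """Return edges (gene_a, gene_b, r2) for all pairs with r2 >= cutoff."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expr), 2):
        r2 = r_squared(expr[gene_a], expr[gene_b])
        if r2 >= r2_cutoff:
            edges.append((gene_a, gene_b, r2))
    return edges

# Hypothetical expression values for three genes across five arrays.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],   # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # largely unrelated to geneA and geneB
}

edges = build_network(expr, r2_cutoff=0.64)
for gene_a, gene_b, r2 in edges:
    print(f"{gene_a} -- {gene_b}  (r2={r2:.2f})")
```

With these made-up values, only geneA and geneB are linked; geneC remains an unconnected node.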

A number of groups have established Web-based interfaces for mining precomputed coexpression results and coexpression networks in Arabidopsis (Arabidopsis thaliana). To our knowledge, none offers ways to recompute the coexpression networks using subsets of experiments and arrays, perhaps because of technical challenges involved. Recomputing correlation between a query gene and all the genes in the genome is computationally intensive and cannot easily be done in real-time during a user's visit to a Web site. However, the coexpression networks arising from different inputs may vary greatly, depending on sample or tissue type. We address this problem by developing an easy-to-use Web tool (CressExpress) that allows users to select distinct tissue types and experiments to include in an analysis, which then executes the calculations offline. When the analysis finishes, users receive an e-mail linking to a zip file on the CressExpress site that contains a complete package of results, along with a record of all parameters and samples used as inputs to the experiment. By providing a complete report of results and inputs, CressExpress makes it convenient for users to integrate coexpression analysis into their research workflow.


    RESULTS
CressExpress is an easy-to-use Web-based tool that allows researchers to set up and run coexpression analysis experiments using a variety of different data sets and sample types. To set up and run an experiment, users enter query identifiers and analysis parameters on the CressExpress Web site located at http://www.cressexpress.org. To begin an analysis, users click the "Run the Tool" link and then proceed through a series of screens (Fig. 2) that offer the opportunity to vary quality control (QC) parameters, specify data release and array platforms, and select subsets of sample types to include in the analysis. This last feature can be particularly important for query genes that exert their effects in a tissue- or developmentally restricted fashion. At each step, a "help" icon links to a page describing the various options and how they would likely affect the analysis results. CressExpress also provides reasonable default choices so that users can easily perform pilot studies and quickly learn how the tool operates.


Figure 2. CressExpress operation. Screen captures showing step-by-step operation of the coexpression tool are shown.

 
In step one, users choose a data release of expression values that will be used in the analysis. Currently, there are four data release options, each one representing different array collections and array processing methods (Table I). Each release contains expression data harvested from the Nottingham Arabidopsis Stock Center (NASC) AffyWatch subscription service (Craigon et al., 2004) and includes samples from two Affymetrix Arabidopsis array designs: the ATH1 array (22,810 probe sets) and the AG array (approximately 8,000 probe sets). Release 2.0 provides the same data used in a previously published analysis of metabolic pathways; we provide this data set as a courtesy for users interested in investigating the prior study's results (Wei et al., 2006). Releases 3.0, 3.1, and 3.2 are the same set of arrays, but the expression values in each were generated using different processing methods. We provide data from different array processing methods mainly for users who want to compare results with other online tools or investigate how these methods may affect downstream coexpression analysis results. However, we generally recommend using releases 2.0 or 3.0, which were generated using the RMA algorithm (Irizarry et al., 2003). We recommend these releases mainly because we have observed good separation between correlation coefficients for probe sets expected to be correlated (e.g. redundant probe set pairs; Cui and Loraine, 2006) versus probe set pairs selected at random from the genome (C. Sherrill and A.E. Loraine, unpublished data).


Table I. CressExpress data releases

 
Step one also features an option to configure a QC setting for individual arrays. This QC setting is based on a Kolmogorov-Smirnov (KS) test of deleted residuals, which is described in detail elsewhere (Persson et al., 2005; Trivedi et al., 2005), but we also describe it briefly here. The KS-D test statistic ranges from 0 to 1 and quantifies how much a given array's expression values deviate from other arrays in the same group. Arrays sharing the same NASC experiment identifier are considered as belonging to a single group. Arrays with larger KS-D test statistics are of lower quality, at least with respect to how well they resemble other arrays in the same experiment. Decreasing the KS-D threshold excludes outlier arrays that are more variable with respect to the other arrays in the same experimental grouping. Eliminating these lower-quality, outlier arrays can therefore increase observed correlation between genes that are coexpressed. We recommend that when running the tool with a relatively small number of arrays (e.g. <50), users should utilize the default KS-D value of 0.15, because computing coexpression with a smaller number of arrays makes the regression more vulnerable to outliers that can skew results. Coexpression analysis experiments involving more arrays will be less vulnerable to outlier arrays, and eliminating the outlier arrays will have less effect. Figure 3 presents the distribution of KS-D statistics computed for each data release. Note that the release 2.0 arrays were prescreened to exclude arrays with KS-D > 0.15 as described by Wei et al. (2006).
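To make the effect of the KS-D threshold concrete, the short sketch below keeps only arrays whose precomputed KS-D statistic does not exceed a chosen threshold. It does not compute the KS-D statistic itself; the array identifiers and KS-D values are invented for illustration, and the function name is hypothetical.

```python
def filter_arrays(ksd_by_array, threshold=0.15):
    """Keep arrays whose KS-D statistic is at or below the threshold.

    ksd_by_array maps an array identifier to its precomputed KS-D value
    (0 to 1; larger values indicate a poorer match to the other arrays
    in the same experiment).
    """
    return sorted(
        array_id for array_id, ksd in ksd_by_array.items() if ksd <= threshold
    )

# Hypothetical arrays with made-up KS-D values.
ksd_values = {"array_1": 0.05, "array_2": 0.15, "array_3": 0.40, "array_4": 0.11}

kept = filter_arrays(ksd_values)                        # default threshold 0.15
stricter = filter_arrays(ksd_values, threshold=0.10)    # excludes more arrays
print(kept)
print(stricter)
```

Lowering the threshold removes more arrays, which matters most when few arrays remain for the regression.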


Figure 3. Distribution of KS-D test statistics for CressExpress data releases. KS-D statistics for release 2.0 (A), 3.0 (B), 3.1 (C), and 3.2 (D) are shown.

 
In step two, users enter a list of up to 50 queries, using Arabidopsis Genome Initiative (AGI) gene names or probe set names to identify genes. To map AGI gene names onto probe set names, CressExpress uses the probe set-to-gene id annotations provided by The Arabidopsis Information Resource (TAIR). However, because these mappings are problematic in some cases (Cui and Loraine, 2006), CressExpress always reports results using both probe set identifiers and AGI codes. Users also use this screen to specify the array type; options include the ATH1 (22,810 probe sets; Redman et al., 2004) or the older AG (approximately 8,000 probe sets) array, both from Affymetrix. By default, the ATH1 array is selected, because more recent and greater amounts of data are available for the ATH1 relative to the AG array. For each query gene, the CressExpress server will perform a large-scale linear regression experiment, comparing each query to all genes represented on the selected array, using all or just some expression data stored in the CressExpress database, depending on the sample types and experiments selected in subsequent steps.

In steps three and four, users choose sample types (step three) and experiments (step four) to include in the regression analysis for their query genes. In step three, CressExpress builds a list of sample types for arrays that met the QC and array type criteria specified in previous steps. Users can select all sample types (the default) or subsets of sample types from a menu listing, where the wording for each item on the list comes from the text provided by NASC. Because NASC obtains the textual description of sample types from experimenters who generated and submitted the original data, the sample type descriptions may use a variety of different terms meaning the same thing. Therefore, we advise users to read the entire list when attempting to limit their analyses to specific tissue types, because different text may have similar meanings. For example, users wishing to examine coexpression networks of flowering organs might select sample types labeled "flowers" and "flower buds" as well as "inflorescence."

The default option for steps three and four is to use all available sample types. This default setting is mainly for the convenience of users wishing to run quick pilot experiments and become familiar with the tool and how it operates. To take full advantage of the CressExpress tool, we recommend that users choose sample types where the query gene products are expected to be expressed or active. For example, users interested in investigating the coexpression network surrounding genes involved in flowering should choose sample types that include flowers. Similarly, users interested in investigating pathways involved in photosynthesis should choose sample types derived from green shoots and leaves.
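Because sample type descriptions are free text, selecting a tissue usually means matching several related terms. The sketch below illustrates one simple way to do this with case-insensitive substring matching; the sample type strings and the function name are hypothetical, and this is not how the CressExpress menu itself works (there, users pick items by hand).

```python
def select_sample_types(sample_types, keywords):
    """Return sample type labels containing any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [s for s in sample_types if any(k in s.lower() for k in wanted)]

# Hypothetical sample type labels, in the style of submitter-provided text.
sample_types = [
    "flowers",
    "flower buds",
    "inflorescence",
    "rosette leaf",
    "root",
    "Pollen",
]

flower_like = select_sample_types(
    sample_types, ["flower", "inflorescence", "pollen"]
)
print(flower_like)
```

Substring matching can also pick up unintended labels, so the selected list should still be read through before it is used.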

To demonstrate the effects of sample type filtering, Figure 4 presents a diagram showing output from a coexpression experiment in which 185 flowering-related genes were compared to each other. Each connection in the network represents an r² value ≥ 0.64, such that each pair of connected genes exhibits an expression correlation ≥0.8 or ≤–0.8. Figure 4A shows the network as computed using all 1,771 arrays from CressExpress data release 3.0, while Figure 4B shows the network computed using a subset of 129 arrays that were labeled as being from flower- or pollen-related samples. The flower-based network shown in Figure 4B contains many more connections than in Figure 4A, demonstrating that, in this case, restricting inputs to biologically relevant sample types yields a richer, more informative network. For example, one of the best-connected genes in Figure 4B is AT3G18990 (VRN1), which encodes a DNA-binding protein involved in vernalization, the process by which exposure to cold temperatures helps to trigger flowering in Arabidopsis (Chandler et al., 1996; Levy et al., 2002). This gene is absent from the network generated from all available samples. This means that if one were entirely unaware that VRN1 plays a role in flowering, an analysis of the flower-based coexpression network would reveal its role thanks to the many connections VRN1 shares with flowering-related genes, but an analysis of the network generated from all available samples would not. However, we have also observed that in many instances, reducing the number of samples can inflate the number of connections, even among groups of genes that one would not normally consider to be coexpressed. To control for this, we recomputed the flowering network 100 times, using different randomly selected subsets of 129 arrays. Not once did we observe a network as richly connected as the one computed from flowering samples alone, suggesting that, in this case, using flowering-related samples exposes coexpression relationships that would otherwise be masked when all available data are used.
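The random-subset control described above can be sketched as follows: count the edges in the network built from the selected arrays, then repeat the count on many random array subsets of the same size and ask how often a random subset is at least as richly connected. This is an illustration on synthetic data, not the analysis code used for Figure 4; all names, the data-generating scheme, and the seeds are assumptions made for the example.

```python
import random
from itertools import combinations

def r_squared(x, y):
    """Square of Pearson's correlation coefficient for two series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)

def count_edges(expr, arrays, cutoff=0.64):
    """Number of gene pairs with r2 >= cutoff using only the given arrays."""
    edges = 0
    for g1, g2 in combinations(sorted(expr), 2):
        x = [expr[g1][i] for i in arrays]
        y = [expr[g2][i] for i in arrays]
        if r_squared(x, y) >= cutoff:
            edges += 1
    return edges

def random_subset_counts(expr, n_arrays, subset_size, trials=100,
                         cutoff=0.64, seed=0):
    """Edge counts for `trials` random array subsets of the same size."""
    rng = random.Random(seed)
    return [
        count_edges(expr, rng.sample(range(n_arrays), subset_size), cutoff)
        for _ in range(trials)
    ]

# Synthetic data: three genes covary tightly in the first 10 arrays
# (standing in for "flower" samples) and vary independently elsewhere.
data_rng = random.Random(1)
n_arrays = 40
flower_arrays = list(range(10))
expr = {
    gene: [
        (i + data_rng.gauss(0, 0.1)) if i < 10 else data_rng.gauss(5, 3)
        for i in range(n_arrays)
    ]
    for gene in ("g1", "g2", "g3")
}

observed = count_edges(expr, flower_arrays)
null_counts = random_subset_counts(expr, n_arrays, len(flower_arrays))
as_rich = sum(c >= observed for c in null_counts)
print(f"observed edges: {observed}; random subsets at least as rich: {as_rich}/100")
```

Matching the subset size is the key design point: it separates the effect of choosing relevant samples from the effect of simply using fewer arrays.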


Figure 4. Coexpression network for flowering-related genes without (A) and with (B) sample-type filtering. In B, only samples derived from flowers were included in the analysis.

 
In step five, users may configure a pathway-level coexpression (PLC) analysis, which identifies genes that are coexpressed with multiple query genes. Seen from the point of view of the larger coexpression network, PLC analysis identifies query genes' common neighbors, such that each neighbor is connected to two or more members of the query group. We previously used PLC analysis to identify candidate genes involved in metabolic pathways and cell wall biosynthesis, and, in general, we have found that genes identified as coexpressed not just with a single gene but with ensembles of genes acting together are often the best and most successful candidates to test (Persson et al., 2005; Wei et al., 2006). Running a PLC analysis requires the user to have entered two or more query genes in step two and also requires the user to specify a linear regression r² cutoff parameter for designating coexpression between gene pairs, where the r² value is taken from the linear regression performed during the initial analysis. A default value of 0.36, equivalent to a correlation coefficient (Pearson's r) of 0.6, is provided for convenience.

Interpreting and using the PLC functionality and its results requires understanding how the PLC algorithm operates, and so we describe it in detail here. The PLC method as implemented in CressExpress operates as described previously, with some differences in how results are ranked (Wei et al., 2006). The PLC algorithm examines the coexpression results for each of the user-supplied query genes and then builds a list of candidate genes that are coexpressed with two or more members of the query group, where coexpression is defined as an r² regression result greater than the user-specified threshold. Genes coexpressed with two or more members of the query gene group are ranked first according to the number of genes within the query group with which they are coexpressed, and second by the average r² value. Thus, genes that are coexpressed with many members of the query group have a higher rank than genes coexpressed with fewer members of the query group.
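The ranking rule described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the CressExpress implementation: the input layout (a mapping from each query to the r² of every other gene), the gene labels, the r² values, and the choice to skip a query's comparison with itself are all assumptions made for the example.

```python
def plc_rank(results, r2_threshold=0.36):
    """Rank candidate genes by pathway-level coexpression.

    results maps each query gene to a dict of {gene: r2}.
    A candidate must exceed the threshold with two or more queries.
    Candidates are ordered by the number of coexpressed queries
    (descending), then by mean r2 over those queries (descending).
    """
    hits = {}
    for query, r2_by_gene in results.items():
        for gene, r2 in r2_by_gene.items():
            if gene != query and r2 > r2_threshold:
                hits.setdefault(gene, []).append(r2)
    ranked = [
        (gene, len(values), sum(values) / len(values))
        for gene, values in hits.items()
        if len(values) >= 2
    ]
    # Gene name is the final key so that ties are ordered deterministically.
    ranked.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return ranked

# Hypothetical regression results for three queries (Q1..Q3).
results = {
    "Q1": {"A": 0.80, "B": 0.50, "C": 0.90},
    "Q2": {"A": 0.70, "B": 0.60, "C": 0.20},
    "Q3": {"A": 0.40, "B": 0.30, "D": 0.95},
}

ranked = plc_rank(results)
for gene, n_queries, mean_r2 in ranked:
    print(f"{gene}: coexpressed with {n_queries} queries, mean r2={mean_r2:.2f}")
```

In this made-up example, gene A ranks first because it is coexpressed with all three queries, gene B ranks second with two, and genes C and D are dropped because each passes the threshold with only one query, despite their high r² values.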

The PLC method is most useful and relevant when at least two of the query genes are coexpressed with each other at the given PLC r² threshold. One way to determine the best cutoff for determining coexpression in PLC is to examine the correlations between the query genes that are expected ahead of time to be coexpressed and then choose an r² threshold that is lower than the smallest pairwise r² between them. If this is done, then the CressExpress PLC will recover all genes that are coexpressed with the queries at least as well as any two query genes are coexpressed with each other. To find out the threshold r² between queries, users can run the tool once, examine the list of coexpressed genes for each query individually to find out the cross-query r² values, and then rerun the tool with a new PLC r² threshold that is smaller than the lowest cross-query pair. Another method for finding the correlations between queries would be to use the CressExpress expression data direct access method in conjunction with a statistical analysis environment like R (R Development Core Team, 2008) or TableView (Johnson et al., 2003), described in detail below. Regardless of the precise r² threshold chosen, the PLC analysis has the most meaning when at least two or more of the query gene products are expected to be coexpressed, because they act in concert either in the same pathway or as constituents of a protein complex.
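The threshold-selection procedure described above amounts to finding the smallest pairwise r² among the queries and choosing a cutoff slightly below it. The sketch below shows this under stated assumptions: the 0.37 value for SUR1 and CYP79B2 is the one reported in the glucosinolate example later in this article, the other pairs and values are placeholders, and the margin of 0.02 is an arbitrary illustrative choice rather than a CressExpress setting.

```python
def suggest_plc_threshold(pairwise_r2, margin=0.02):
    """Suggest a PLC r2 cutoff just below the weakest query-query pair.

    pairwise_r2 maps (query_a, query_b) tuples to their r2 value.
    Returns (weakest_pair, weakest_r2, suggested_threshold).
    """
    weakest_pair = min(pairwise_r2, key=pairwise_r2.get)
    weakest_r2 = pairwise_r2[weakest_pair]
    threshold = max(0.0, round(weakest_r2 - margin, 2))
    return weakest_pair, weakest_r2, threshold

# Cross-query r2 values gathered from a first pilot run.
pairwise_r2 = {
    ("SUR1", "CYP79B2"): 0.37,        # value reported in the text
    ("query_x", "query_y"): 0.52,     # placeholder pair, made-up value
    ("query_x", "query_z"): 0.71,     # placeholder pair, made-up value
}

pair, lowest, threshold = suggest_plc_threshold(pairwise_r2)
print(f"weakest pair: {pair}, r2={lowest}; rerun with PLC threshold {threshold}")
```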

The final step (step six) asks the user to enter a comma-separated list of one or more e-mail addresses that will receive e-mails reporting on the status of the analysis. Upon successful completion of the analysis, each address receives a "Job Completion" message containing a link to a compressed archive folder (a zip file) stored temporarily on the CressExpress server. The zip file contains the full complement of analysis results as well as a record of all parameters used in the experiment (Table II). CressExpress generates results files (in csv format) for each query gene, in which regression results comparing the query gene to all probe sets on the selected array are reported. The spreadsheet files are named after the query gene's matching probe set and can be loaded directly into Excel or any other program capable of reading comma-separated format files. Each row of data represents the results from a single linear regression comparing the query gene against another gene on the designated array and includes the linear regression P value, r² value, and slope, along with a brief description of the target gene. These descriptions come from TAIR and are provided for the convenience of users as they scan results searching for interesting patterns in the types of genes that are most highly coexpressed with their queries.
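As a sketch of how a per-query results file might be processed outside a spreadsheet, the following Python snippet reads csv text, keeps rows at or above an r² cutoff, and sorts them by r². The column names (probe_set, agi, r2, p_value, slope, description) and all row values are hypothetical; the actual header names in CressExpress output may differ and should be checked against a real results file.

```python
import csv
import io

def top_hits(csv_text, r2_cutoff=0.36):
    """Return rows with r2 >= cutoff, sorted by r2 (highest first)."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        r2 = float(row["r2"])
        if r2 >= r2_cutoff:
            slope = float(row["slope"])
            hits.append({
                "probe_set": row["probe_set"],
                "agi": row["agi"],
                "r2": r2,
                # The sign of the slope gives the direction of coexpression.
                "direction": "positive" if slope > 0 else "negative",
                "description": row["description"],
            })
    hits.sort(key=lambda h: h["r2"], reverse=True)
    return hits

# Made-up file contents using assumed column names.
example_csv = """\
probe_set,agi,r2,p_value,slope,description
probe_1,gene_1,0.81,1e-30,0.92,hypothetical protein
probe_2,gene_2,0.12,0.04,-0.10,example kinase
probe_3,gene_3,0.49,1e-9,-0.75,example transporter
"""

hits = top_hits(example_csv)
for h in hits:
    print(h["probe_set"], h["agi"], h["r2"], h["direction"])
```

For a file on disk, the same function could be given the text returned by reading the file.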


Table II. Data output files

When an analysis completes, users receive an e-mail message reporting a URL where they can download a zipped folder containing several files with results and descriptions of the analysis. Several of these are listed.

 
The PLC results files include csv files reporting coexpressed neighbor genes and corresponding Web pages with hyperlinks to TAIR, allowing for rapid review of results. In addition, the PLC analysis generates a network file (coexp.sif) together with companion node and edge attribute files suitable for loading and visualization in Cytoscape, a popular network analysis and visualization tool (Shannon et al., 2003). Users can configure Cytoscape to utilize a custom styles file packaged with the results files. Directions on how to view the network file using Cytoscape appear in the FAQ section of the CressExpress Web site. Figure 5 presents an example of a Cytoscape visualization showing PLC results for six CESA (cellulose synthase) genes from Arabidopsis. This type of visualization is particularly useful when query genes resolve into separate networks, as with the six CESA genes presented in the figure. As can be seen from Figure 5, CESA1, CESA3, and CESA6, which predominate in primary cell wall formation, are connected to a different set of genes than are CESA4, CESA7, and CESA8, which predominate in secondary cell wall biosynthesis. In this case, the coexpression connections for the six genes appear to mirror the distinct functions of the secondary and primary cell wall genes encoding cellulose synthase subunits.
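For readers unfamiliar with the simple interaction format (SIF) that Cytoscape reads, each line names a source node, an interaction type, and a target node. The sketch below writes a small edge list in that layout. The interaction label "coexp" and the candidate gene names are made up for illustration; the labels used in the coexp.sif file produced by CressExpress may differ.

```python
def to_sif(edges, interaction="coexp"):
    """Format (source, target) pairs as tab-delimited SIF text."""
    lines = [f"{source}\t{interaction}\t{target}" for source, target in sorted(edges)]
    return "\n".join(lines) + "\n"

# Hypothetical query-to-neighbor edges.
edges = [
    ("CESA4", "candidate_1"),
    ("CESA7", "candidate_1"),
    ("CESA1", "candidate_2"),
]

sif_text = to_sif(edges)
print(sif_text, end="")

# To save for loading into Cytoscape:
# with open("example.sif", "w") as handle:
#     handle.write(sif_text)
```

In this toy network, CESA4 and CESA7 share a neighbor while CESA1 connects to a different one, so the graph resolves into two separate components.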


Figure 5. Cytoscape visualization depicting PLC results for CESA genes involved in primary (A) and secondary (B) cell wall biosynthesis.

 
Because CressExpress distributes data in bulk as one zip file, users typically find it relatively easy to track and store results from CressExpress analysis runs using their preferred electronic data archiving scheme. For example, users who incorporate results from CressExpress in published work might prefer to distribute the original zip file as part of supplementary data files on journal or lab Web sites. The CressExpress design philosophy is that each run of the CressExpress tool is a self-contained experiment and should not only be easy to repeat but also should be easy to record and incorporate into users' research workflow. By providing complete records of results and experimental parameters, CressExpress aims to make it easier for users to manage and track their in silico coexpression experiments.


CressExpress Direct Access to Expression Values

CressExpress offers direct access to precomputed expression data via a simple URL-based method in which users access expression values for specific probe sets by encoding the requests as Web addresses. Using the direct access approach, users can retrieve expression values for individual genes and arrays, save the values to local files, or import them directly into Web-enabled desktop programs like R or TableView (Johnson et al., 2003). This feature of CressExpress is useful for advanced users who wish to perform their own custom analyses and therefore need direct access to the underlying expression data used by the CressExpress tool in the large-scale coexpression analysis. On the CressExpress Web site, the tabbed panel labeled "visualization" links to a tutorial explaining how to use the direct access method to import expression data into the TableView program, a user-friendly, freely available desktop visualization tool implemented in Java. The tutorial describes how users can identify "outlier" arrays that signal potentially meaningful deviations from the coexpression patterns between genes, identify sets of samples that yield unusually high or low expression values for a given gene, or compare the relative variability of similar sample types from different experiments. For more advanced users, the "web services" link provides an example of a short R script showing how to import the data into the R statistical programming environment and compute Pearson's correlation coefficient between query genes from the glucosinolate biosynthesis pathway. We also provide several BioMoby services for programmers to incorporate CressExpress functionality into their applications (Wilkinson et al., 2005). We do not describe these here but instead refer readers to the CressExpress "Web Services" section, which documents the BioMoby services and their uses.


Biological Application of the Coexpression Tool: Example Analyses

Example Analysis: Cellulose Synthase Enzymes and Cell Wall Biosynthesis
The Arabidopsis genome contains several CESA genes encoding putative and known components of multisubunit cellulose synthase complexes responsible for primary and secondary wall formation. Previously, we described a coexpression-based analysis that identified genes that were coexpressed with all or some of the CESA genes, and subsequent analysis of mutant phenotypes for some of these genes confirmed their role in cell wall formation and/or stability (Persson et al., 2005). Here, we demonstrate how researchers can use the coexpression tool to recapitulate and extend the analysis, using new releases of expression data featured as part of the CressExpress tool.

Following the procedures described above, we ran a CressExpress experiment using the six primary and secondary cell wall genes (Table III) as queries and a PLC r² cutoff of 0.36. Figure 5 shows a screen capture from the Cytoscape network visualization tool depicting the six query genes and their PLC-identified neighbors. We find that the secondary and primary cell wall CESA genes are coexpressed with different, nonoverlapping groups. Of the genes linked with secondary cell wall CESA genes (Fig. 5B), at least 17 have been investigated experimentally and found to exhibit secondary cell wall-related phenotypes and functions (Table IV), while many more are annotated as having predicted functions related to carbohydrate synthesis or cell wall functions (Supplemental Data S1). This example illustrates how one might use CressExpress to tease apart the different functions of genes that share considerable similarity at the sequence level but which may play distinct roles in the plant body. In this case, it was already known that CESA4, CESA7, and CESA8 perform a different role from CESA1, CESA3, and CESA6, and we found that the coexpression analysis tends to confirm this view, because the two groups are coexpressed with nonoverlapping groups of genes, as determined by the PLC analysis. The same argument may be made for other closely related genes, potentially yielding new hypotheses regarding gene function even for members of closely related gene families.


Table III. Probe set-to-target gene mappings for genes encoding cellulose synthase enzymes involved in secondary and primary cell wall formation

Probe set to gene mappings are from TAIR.

 

Table IV. PLC analysis results for secondary cell wall biosynthesis genes CESA4, CESA7, and CESA8

Probe set identifiers and their corresponding target genes (AGI identifiers) that are highly correlated with two or more genes involved in secondary cell wall formation are listed. Column heading r² indicates the mean of the r² values obtained from regressing the gene in column 1 against each of the three CESA genes. Column 5 (Phenotype) indicates whether the studies referenced in column 6 (Refs.) determined that the gene in column 1 possessed a secondary cell wall-related phenotype. The terms "mild" and "severe" indicate degree of phenotypic severity, n/a indicates no published studies were found, and no/yes indicates conflicting reports. Numbers in column 6 refer to the following publications: 1, Brown et al. (2005); 2, Ko et al. (2006); 3, Sawa et al. (2005); and 4, Ubeda-Tomas et al. (2007).

 
Example Analysis: Glucosinolate Biosynthesis from Trp
Glucosinolates are nitrogen- and sulfur-containing secondary metabolites that are derived from several different amino acids in plants, including Arabidopsis and many agriculturally important Brassicaceae species (Grubb and Abel, 2006; Halkier and Gershenzon, 2006). Glucosinolates undergo conversion to toxic or otherwise bioactive breakdown products through the action of β-thioglucosidase enzymes called myrosinases that only come into contact with their glucosinolate substrates when cells are damaged and cellular contents mix. This mechanism is termed the "mustard oil bomb" and contributes to the plant's ability to resist pathogen attack and herbivory. Glucosinolate breakdown products provide the characteristically pungent tastes of horseradish and wasabi, and at least one has been shown to have anti-cancer properties. Still others are toxic to animals in high doses. Because of their clear importance in plant defense as well as nutrition and human health, glucosinolate biosynthesis has been studied intensively in Arabidopsis and related species, and many of the genes required in glucosinolate synthesis have been identified. Hirai et al. (2007) used a coexpression-based approach to identify Myb-family transcription factors required for biosynthesis of Met-derived glucosinolates. Similarly, Gachon et al. (2005) used hierarchical clustering of expression profiles to expose patterns of correlated expression among genes involved in synthesis of glucosinolates from Trp. Coexpression analysis of glucosinolate biosynthesis in Arabidopsis therefore provides an excellent example application for the CressExpress tool.

We used the AraCyc database hosted at the TAIR Web site to look up AGI codes for genes associated with the indolic glucosinolates pathway (Zhang et al., 2005). The pathway as annotated in AraCyc includes five reactions catalyzed by six gene products (Table V). We used CressExpress to determine the degree to which these genes are coexpressed with each other. Using the CressExpress default parameters, we performed a coexpression analysis of all six genes, comparing them to each other and to all other genes represented on the ATH1 array. This initial pilot study revealed that all six genes are highly coexpressed with each other. The lowest r2 we obtained for any pair of genes in the pathway was 0.37, obtained for SUR1 compared with CYP79B2. We therefore ran CressExpress a second time, using identical parameters but with a PLC r2 threshold of 0.35, to capture both the coexpression network involving the six biosynthetic genes and any surrounding genes that are coexpressed to the same or a slightly lesser degree than the query genes.



 
Table V. Genes involved in glucosinolate biosynthesis from Trp as reported in AraCyc version 3.5

Values in the column labeled "Other Names" are from TAIR. Names marked with * are from a review of glucosinolate biosynthesis (Grubb and Abel, 2006). Probe set to gene mappings are from TAIR.

 
The PLC analysis revealed 155 different genes that are coexpressed (at r2 ≥ 0.35) with two or more of the glucosinolate pathway genes; of these, seven are coexpressed with all six. We then used the BioMoby literature aggregator tool (LitRep, http://mips.gsf.de/proj/planet/araws/litRepSearch.html) to search for articles referencing these seven genes (Table VI). One gene (AT1G18590) has already been shown to play a role in glucosinolate biosynthesis (Piotrowski et al., 2004), and annotations on the others (see Table VI) suggest they are reasonable candidates for involvement in this pathway. Testing their role in glucosinolate biosynthesis or plant defense is beyond the scope of this article, but their strong coexpression with each annotated gene in the pathway makes them good candidates for experimental investigation.
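The counting step behind a PLC analysis can be illustrated with a short Python sketch. It assumes that the r2 values between each query gene and each target gene have already been computed; the function name, the dictionary layout, the tie-breaking by mean r2, and the toy values are all hypothetical and are not taken from the CressExpress code.

```python
from collections import defaultdict

def pathway_level_coexpression(r2_by_query, threshold=0.35, min_queries=2):
    """Rank target genes by how many query genes they are coexpressed with.

    r2_by_query maps query gene -> {target gene -> r2}. This layout is an
    assumption made for the sketch, not the CressExpress data model.
    """
    hits = defaultdict(list)
    for query, targets in r2_by_query.items():
        for target, r2 in targets.items():
            if target != query and r2 >= threshold:
                hits[target].append((query, r2))

    ranked = []
    for target, pairs in hits.items():
        if len(pairs) >= min_queries:
            mean_r2 = sum(r2 for _, r2 in pairs) / len(pairs)
            ranked.append((target, len(pairs), mean_r2))

    # Most shared connections first, then highest mean r2 (assumed ordering).
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked

# Toy values, invented for illustration only.
toy = {
    "Q1": {"G1": 0.60, "G2": 0.40, "G3": 0.10},
    "Q2": {"G1": 0.50, "G2": 0.20, "G3": 0.36},
    "Q3": {"G1": 0.70, "G2": 0.45, "G3": 0.05},
}

for target, n_queries, mean_r2 in pathway_level_coexpression(toy):
    print(target, n_queries, round(mean_r2, 3))
```

In this toy input, G3 passes the threshold for only one query gene and is therefore excluded, while G1 (three connections) ranks ahead of G2 (two connections).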



 
Table VI. Results from PLC analysis of genes involved in biosynthesis of glucosinolates from Trp

Several genes with unknown functions or functions related to sulfur metabolism and Trp biosynthesis exhibit strong coexpression with all six genes from the glucosinolate biosynthesis pathway, as annotated in AraCyc 3.5. The full list of genes identified via PLC appears in Supplemental Data S2. References cited were collected using LitRep, a BioMoby literature aggregator search tool that queries the Aramemnon, ATIDB, TAIR, and MPIMP databases (http://mips.gsf.de/proj/planet/araws/litRepSearch.html).

 

    DISCUSSION
 
We developed CressExpress, an easy-to-use, freely available Web-based tool that allows users to run coexpression analysis experiments using biologically relevant subsets of expression data harvested from public domain Affymetrix Arabidopsis expression microarrays. For each experiment, the tool recomputes coexpression relationships by performing a linear regression comparing each user-entered query gene to all other genes represented in the expression data, using a user-selected subset of experiments in the database. Computing the linear regressions typically requires several minutes, and so individual analyses are performed offline as distinct jobs. All analysis software executes on a server remote from the user's desktop, and users set up experiments using their Web browser by proceeding through a series of steps in which they enter query genes, choose QC parameters, and specify sample types to include in the analysis. At each point where a choice must be made, CressExpress provides links to help pages describing how the different parameters may affect results and also supplies reasonable defaults for users wanting to run preliminary pilot experiments. Thus, to operate the tool, users need only be able to operate a Web browser and have access to an e-mail account. When the experiment job completes, users receive an e-mail containing a link to a compressed zip file on the server containing results and output files.
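The per-experiment recomputation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CressExpress implementation (which the article says is written in Java with a commercial statistics library): the function names, the gene and array labels, and the expression values are hypothetical, and P values are omitted. Because r2 is symmetric for a simple linear regression, the choice of which gene serves as the predictor does not affect it.

```python
def linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length vectors with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("constant expression vector; r2 is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2

def coexpression_scan(expression, query, arrays):
    """Regress every other gene on the query gene, using only the chosen arrays.

    expression maps gene -> {array id -> expression value} (assumed layout).
    """
    x = [expression[query][a] for a in arrays]
    results = {}
    for gene, values in expression.items():
        if gene == query:
            continue
        y = [values[a] for a in arrays]
        results[gene] = linear_regression(x, y)
    return results

# Toy expression values, invented for illustration only.
expression = {
    "QUERY":  {"a1": 1.0, "a2": 2.0, "a3": 3.0, "a4": 4.0},
    "GENE_A": {"a1": 2.0, "a2": 4.0, "a3": 6.0, "a4": 8.0},
    "GENE_B": {"a1": 3.0, "a2": 1.0, "a3": 4.0, "a4": 2.0},
}

all_arrays = coexpression_scan(expression, "QUERY", ["a1", "a2", "a3", "a4"])
subset = coexpression_scan(expression, "QUERY", ["a1", "a2", "a3"])
print(all_arrays["GENE_B"][2], subset["GENE_B"][2])
```

Running the scan on all four toy arrays and then on a three-array subset gives different r2 values for GENE_B, which is the effect that sample selection is meant to expose.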

Several groups have developed online, Web-based analysis tools that offer a variety of methods and approaches for analyzing and visualizing publicly available Arabidopsis expression data (Zimmermann et al., 2004; Manfield et al., 2006; Obayashi et al., 2007). However, these tools are based on precomputed correlations stored in a database and do not allow users to specify experiments or sample types to include in the calculations. CressExpress, by contrast, recomputes correlations for each experiment from the arrays the user selects, which makes it possible to compare how the observed coexpression network changes when different arrays are included in an analysis. In addition, CressExpress explicitly tracks data releases, allowing users to experiment with different parameter settings, such as different array types, without concern that the underlying database has changed between experiments. To our knowledge, none of the other coexpression data-mining systems provides either the ability to specify sample types to include in analysis or detailed information about analysis inputs and parameters.

Another distinctive feature of CressExpress is that it provides comprehensive analysis results in formats aimed at facilitating downstream data-mining and analysis. After a run of the tool, users receive a set of simple, comma-separated plain text files, one for each query gene. Each of these files lists the r2, slope, and P values for each linear regression between the query and all possible target genes, along with a short textual description of each target to aid users as they explore the results. In general, CressExpress aims to provide data in ways that allow researchers to visualize and explore results using desktop visualization programs and analysis tools, such as Excel, TableView (Johnson et al., 2003), and Cytoscape (Shannon et al., 2003). CressExpress also provides a direct access method allowing users to retrieve raw expression data directly from the CressExpress database using a simple, URL-based scheme. Using the direct access method, users can build spreadsheets with expression values for every array in a designated data release or import expression values directly into desktop visualization and statistical analysis programs like R and TableView. By providing this relatively straightforward method of accessing data along with tutorials explaining how to use it with R and TableView, CressExpress aims to give users of varying computational experience the opportunity to experiment with computational methods in their research and extract new value from published, publicly available expression microarray data.
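As an illustration of downstream use, a per-query results file of this kind could be filtered with a few lines of Python. The column names, their order, and the rows below are assumptions invented for this sketch (including the placeholder identifiers); the actual CressExpress output files may be laid out differently.

```python
import csv
import io

# Hypothetical file contents; the real column names and order may differ.
SAMPLE = """target,r2,slope,p_value,description
AT0G00001,0.62,0.91,1.2e-40,example target one
AT0G00002,0.18,0.35,3.0e-05,example target two
AT0G00003,0.44,-0.70,8.5e-20,example target three
"""

def strong_targets(handle, min_r2=0.35):
    """Return rows with r2 >= min_r2, sorted from strongest to weakest."""
    reader = csv.DictReader(handle)
    rows = [
        {
            "target": row["target"],
            "r2": float(row["r2"]),
            "slope": float(row["slope"]),
            "p_value": float(row["p_value"]),
            "description": row["description"],
        }
        for row in reader
    ]
    kept = [row for row in rows if row["r2"] >= min_r2]
    kept.sort(key=lambda row: row["r2"], reverse=True)
    return kept

# io.StringIO stands in for an opened results file.
for row in strong_targets(io.StringIO(SAMPLE)):
    print(row["target"], row["r2"], row["slope"])
```

Keeping the slope alongside r2 is useful because r2 alone does not distinguish positive from negative relationships.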

In the future, we plan to add several new features to CressExpress, focusing on three major goals: facilitating comparative coexpression analysis across species, making sample selection easier, and repackaging the software to allow easy deployment at other sites. To streamline array selection (step three), we plan to add a feature that will let users select samples based on their Plant Ontology term annotations as they become available (Avraham et al., 2008). We also would like to support comparisons and candidate gene prediction across species and array types. Toward this end, we developed a prototype tool (see www.ssg.uab.edu/ccpmt) that matches probes and probe sets from different arrays based on target gene similarity. Currently, the tool supports six Arabidopsis arrays and a poplar (Populus spp.) genome array from Affymetrix. We would also like to make it easy for other groups to mirror the CressExpress software on their own sites, thus allowing them to reuse the software with their own custom data sets. In hopes of recruiting community interest in this latter effort, we created a developer's source code release of the software on the CressExpress Web site under the "Data Mining Code" link. However, it is important to note that CressExpress uses a commercial Java statistics library from Visual Numerics, which charges a licensing fee. Although we feel that the convenience of having a robust library of statistical routines is well worth the price, we may ultimately replace this non-free component with a version from the open source community. Another modification we are considering involves increasing the current limit of 50 queries per CressExpress analysis, depending on community interest. Readers who would like to suggest other new features or changes to the tool behavior are welcome to contribute ideas and requests.


    MATERIALS AND METHODS
 

Array Informatics

CD media containing expression data were obtained from the NASC AffyWatch subscription service. Each CD contained CEL files with "raw" expression data, grouped into folders named for the investigator who contributed the data.

Upon receipt of each CD, XML files describing each experiment were harvested from the NASC site, using the "passthru" parameter as described on the Web page located at http://arabidopsis.info/bioinformatics/narraysxml/index.html. Slide names associated with each experiment id were harvested from the XML files by extracting the content of "NASC:Name" tags. For about one-half of the CEL files, we were able to use CEL file names and slide names to connect slides with their corresponding CEL files and thus capture in our database the experimental group affiliation for each CEL file. For the remainder, NASC supplied (via e-mail) Excel spreadsheets that report the correspondence between CEL file names and NASC slide names. The mappings between slide names and CEL file names were also manually checked; cases where the names seemed to disagree or contradict each other were resolved via e-mail correspondence with NASC. The mapping is supplied as part of the direct access method.


Array Processing

CEL file processing steps, including background correction, probe set summarization, and normalization procedures, were done using functions in the Bioconductor "affy" package suite of tools (Gautier et al., 2004). For release 2.0, arrays were processed using the RMA algorithm as described previously (Wei et al., 2006). Each of the 3.x releases included the same set of arrays, including arrays from AffyWatch series I, II, and III, but was processed using different affy package methods and algorithms. For release 3.0, arrays were processed using the justRMA method with default settings. For release 3.1, arrays were processed using the justGCRMA algorithm, again with default parameters. Releases 3.0 and 3.1 were generated using an Altix computer at the Alabama Supercomputer Center. Release 3.2 was generated using the MAS5 algorithm, followed by a scaling step in which expression values for each probe set on the same array were divided by the average probe set value for that array. Following the scaling step, the log (base 2) was taken. Processing for release 3.2 was done on a computer equipped with two 2-GHz AMD Opteron 246 CPUs and 16 GB of RAM running the Linux operating system.
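The scaling and log transformation applied after MAS5 for release 3.2 amounts to the following arithmetic, shown here as a minimal Python sketch. The function name, probe set labels, and signal values are hypothetical; the actual processing was done with Bioconductor tools, not with this code.

```python
import math

def scale_and_log2(array_values):
    """Divide each probe set value by the array mean, then take log base 2.

    array_values maps probe set id -> MAS5 signal for a single array.
    """
    mean_signal = sum(array_values.values()) / len(array_values)
    return {
        probe_set: math.log2(value / mean_signal)
        for probe_set, value in array_values.items()
    }

# Toy signals for one array, invented for illustration only.
toy_array = {"ps1": 50.0, "ps2": 100.0, "ps3": 150.0, "ps4": 100.0}
scaled = scale_and_log2(toy_array)
for probe_set, value in scaled.items():
    print(probe_set, round(value, 3))
```

After this step, a probe set whose signal equals the array average has a value of 0, and each unit corresponds to a twofold difference from that average.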


Coexpression Analysis and QC Procedures

The coexpression tool is based on large-scale linear regression analysis of expression values between genes of interest and the rest of the genes on a selected array using the methodology described previously (Persson et al., 2005; Wei et al., 2006). For each data release, we computed the KS-D statistics for each array, except in cases where there were fewer than three arrays per experiment. Detailed descriptions of the KS-D computation have been previously reported (Persson et al., 2005; Trivedi et al., 2005).
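For readers unfamiliar with the statistic, the following Python sketch computes a generic two-sample Kolmogorov-Smirnov D value for each array in an experiment. The exact KS-D procedure is described in the cited papers and is not reproduced here; in particular, comparing each array against the pooled values of the other arrays in the same experiment is an assumption made for this sketch, as are the function names and toy values. Only the rule of skipping experiments with fewer than three arrays comes from the text above.

```python
def ks_d(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        # Advance both samples past every value equal to v (handles ties).
        while i < len(a) and a[i] == v:
            i += 1
        while j < len(b) and b[j] == v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_d_per_array(experiment):
    """experiment maps array id -> list of expression values (assumed layout)."""
    if len(experiment) < 3:
        return {}  # too few arrays in this experiment to compute the statistic
    scores = {}
    for array_id, values in experiment.items():
        # Assumed reference distribution: all other arrays in the experiment.
        pooled = [
            v
            for other_id, other_values in experiment.items()
            if other_id != array_id
            for v in other_values
        ]
        scores[array_id] = ks_d(values, pooled)
    return scores

# Toy experiment, invented for illustration only; x3 is a deliberate outlier.
toy_experiment = {
    "x1": [1, 2, 3, 4],
    "x2": [1, 2, 3, 4],
    "x3": [11, 12, 13, 14],
}
print(ks_d_per_array(toy_experiment))
```

In the toy experiment, the outlying array x3 receives the maximum possible D of 1.0, so an array of that kind could be flagged by a user-chosen QC cutoff.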


Supplemental Data

The following materials are available in the online version of this article.

Supplemental Data S1. Genes coexpressed with CESA4, CESA7, and CESA8.
Supplemental Data S2. Web page reporting all genes identified by PLC (r2 threshold 0.35) as coexpressed with the glucosinolate biosynthesis pathway.


    ACKNOWLEDGMENTS
 
The authors thank the Alabama Supercomputer Center for providing computational assistance with array processing steps, Jim Johnson for help with the TableView software, the School of Public Health MITS team for computer systems support, and Mikako Kawai for Web page and graphic design. We are particularly grateful to the staff of the NASC for their generosity and help with matching slide and CEL file names. Last, we thank the anonymous reviewers for their excellent comments on the manuscript.

Received January 18, 2008; accepted April 29, 2008; published May 8, 2008.


    FOOTNOTES
 
1 This work was supported by the University of Alabama Center for Nutrient-Gene Interaction, by the National Institutes of Health National Cancer Institute (grant no. U54CA100949), and by the National Science Foundation (grant no. 061012 and Plant Genome award nos. 0217651 and 0501890).

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Ann E. Loraine (aloraine{at}uncc.edu).

[W] The online version of this article contains Web-only data.

[OA] Open Access articles can be viewed online without a subscription.

www.plantphysiol.org/cgi/doi/10.1104/pp.107.115535

* Corresponding author; e-mail aloraine{at}uncc.edu.


    LITERATURE CITED
 
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–390

Avraham S, Tung CW, Ilic K, Jaiswal P, Kellogg EA, McCouch S, Pujar A, Reiser L, Rhee SY, Sachs MM, et al (2008) The Plant Ontology Database: a community resource for plant structure and developmental stages controlled vocabulary and annotations. Nucleic Acids Res 36: D449–D454

Bechtold U, Murphy DJ, Mullineaux PM (2004) Arabidopsis peptide methionine sulfoxide reductase2 prevents cellular oxidative damage in long nights. Plant Cell 16: 908–919

Bender J, Fink GR (1998) A Myb homologue, ATR1, activates tryptophan gene expression in Arabidopsis. Proc Natl Acad Sci USA 95: 5655–5660

Brown DM, Zeef LA, Ellis J, Goodacre R, Turner SR (2005) Identification of novel genes in Arabidopsis involved in secondary cell wall formation using expression profiling and reverse genetics. Plant Cell 17: 2281–2295

Chandler J, Wilson A, Dean C (1996) Arabidopsis mutants showing an altered response to vernalization. Plant J 10: 637–644

Craigon DJ, James N, Okyere J, Higgins J, Jotham J, May S (2004) NASCArrays: a repository for microarray data generated by NASC's transcriptomics service. Nucleic Acids Res 32: D575–D577

Cui X, Loraine AE (2006) Global correlation analysis between redundant probe sets using a large collection of Arabidopsis ATH1 expression profiling data. In LSS Computational Systems Bioinformatics. World Scientific, Stanford, CA, pp 223–227

Ekman DR, Lorenz WW, Przybyla AE, Wolfe NL, Dean JF (2003) SAGE analysis of transcriptome responses in Arabidopsis roots exposed to 2,4,6-trinitrotoluene. Plant Physiol 133: 1397–1406

Gachon CM, Langlois-Meurinne M, Henry Y, Saindrenan P (2005) Transcriptional co-regulation of secondary metabolism enzymes in Arabidopsis: functional and evolutionary implications. Plant Mol Biol 58: 229–245

Gautier L, Cope L, Bolstad BM, Irizarry RA (2004) affy: Analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20: 307–315

Grubb CD, Abel S (2006) Glucosinolate metabolism and its control. Trends Plant Sci 11: 89–100

Halkier BA, Gershenzon J (2006) Biology and biochemistry of glucosinolates. Annu Rev Plant Biol 57: 303–333

Hirai MY, Sugiyama K, Sawada Y, Tohge T, Obayashi T, Suzuki A, Araki R, Sakurai N, Suzuki H, Aoki K, et al (2007) Omics-based identification of Arabidopsis Myb transcription factors regulating aliphatic glucosinolate biosynthesis. Proc Natl Acad Sci USA 104: 6478–6483

Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP (2003) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4: 249–264

Johnson JE, Stromvik MV, Silverstein KA, Crow JA, Shoop E, Retzel EF (2003) TableView: portable genomic data visualization. Bioinformatics 19: 1292–1293

Kiyosue T, Yamaguchi-Shinozaki K, Shinozaki K (1993) Characterization of two cDNAs (ERD11 and ERD13) for dehydration-inducible genes that encode putative glutathione S-transferases in Arabidopsis thaliana L. FEBS Lett 335: 189–192

Ko JH, Beers EP, Han KH (2006) Global comparative transcriptome analysis identifies gene network regulating secondary xylem development in Arabidopsis thaliana. Mol Genet Genomics 276: 517–531

Levy YY, Mesnage S, Mylne JS, Gendall AR, Dean C (2002) Multiple roles of Arabidopsis VRN1 in vernalization and flowering time control. Science 297: 243–246

Lillig CH, Schiffmann S, Berndt C, Berken A, Tischka R, Schwenn JD (2001) Molecular and catalytic properties of Arabidopsis thaliana adenylyl sulfate (APS)-kinase. Arch Biochem Biophys 392: 303–310

Loeffler C, Berger S, Guy A, Durand T, Bringmann G, Dreyer M, von Rad U, Durner J, Mueller MJ (2005) B1-phytoprostanes trigger plant defense and detoxification responses. Plant Physiol 137: 328–340

Manfield IW, Jen CH, Pinney JW, Michalopoulos I, Bradford JR, Gilmartin PM, Westhead DR (2006) Arabidopsis Co-expression Tool (ACT): web server tools for microarray-based gene expression analysis. Nucleic Acids Res 34: W504–W509

Niyogi KK, Fink GR (1992) Two anthranilate synthase genes in Arabidopsis: defense-related regulation of the tryptophan pathway. Plant Cell 4: 721–733

Obayashi T, Kinoshita K, Nakai K, Shibaoka M, Hayashi S, Saeki M, Shibata D, Saito K, Ohta H (2007) ATTED-II: a database of co-expressed genes and cis elements for identifying co-regulated gene groups in Arabidopsis. Nucleic Acids Res 35: D863–D869

Persson S, Wei H, Milne J, Page GP, Somerville CR (2005) Identification of genes required for cellulose synthesis by regression analysis of public microarray data sets. Proc Natl Acad Sci USA 102: 8633–8638

Piotrowski M, Schemenewitz A, Lopukhina A, Muller A, Janowitz T, Weiler EW, Oecking C (2004) Desulfoglucosinolate sulfotransferases from Arabidopsis thaliana catalyze the final step in the biosynthesis of the glucosinolate core structure. J Biol Chem 279: 50717–50725

R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna

Redman JC, Haas BJ, Tanimoto G, Town CD (2004) Development and evaluation of an Arabidopsis whole genome Affymetrix probe array. Plant J 38: 545–561

Saito K, Hirai MY, Yonekura-Sakakibara K (2008) Decoding genes with coexpression networks and metabolomics: ‘majority report by precogs’. Trends Plant Sci 13: 36–43

Sawa S, Demura T, Horiguchi G, Kubo M, Fukuda H (2005) The ATE genes are responsible for repression of transdifferentiation into xylem cells in Arabidopsis. Plant Physiol 137: 141–148

Schuhegger R, Rauhut T, Glawischnig E (2007) Regulatory variability of camalexin biosynthesis. J Plant Physiol 164: 636–644

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504

Stuart JM, Segal E, Koller D, Kim SK (2003) A gene-coexpression network for global discovery of conserved genetic modules. Science 302: 249–255

Trivedi P, Edwards JW, Wang J, Gadbury GL, Srinivasasainagendra V, Zakharkin SO, Kim K, Mehta T, Brand JP, Patki A, et al (2005) HDBStat!: a platform-independent software suite for statistical analysis of high dimensional biology data. BMC Bioinformatics 6: 86

Ubeda-Tomas S, Edvardsson E, Eland C, Singh SK, Zadik D, Aspeborg H, Gorzsas A, Teeri TT, Sundberg B, Persson P, et al (2007) Genomic-assisted identification of genes involved in secondary growth in Arabidopsis utilising transcript profiling of poplar wood-forming tissues. Physiol Plant 129: 415–428

Wagner U, Edwards R, Dixon DP, Mauch F (2002) Probing the diversity of the Arabidopsis glutathione S-transferase gene family. Plant Mol Biol 49: 515–532

Wei H, Persson S, Mehta T, Srinivasasainagendra V, Chen L, Page GP, Somerville C, Loraine A (2006) Transcriptional coordination of the metabolic network in Arabidopsis. Plant Physiol 142: 762–774

Wilkinson M, Schoof H, Ernst R, Haase D (2005) BioMOBY successfully integrates distributed heterogeneous bioinformatics Web Services. The PlaNet exemplar case. Plant Physiol 138: 5–17

Wille A, Zimmermann P, Vranova E, Furholz A, Laule O, Bleuler S, Hennig L, Prelic A, von Rohr P, Thiele L, et al (2004) Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biol 5: R92

Zhang P, Foerster H, Tissier CP, Mueller L, Paley S, Karp PD, Rhee SY (2005) MetaCyc and AraCyc. Metabolic pathway databases for plant research. Plant Physiol 138: 27–37

Zhong W, Sternberg PW (2006) Genome-wide prediction of C. elegans genetic interactions. Science 311: 1481–1484

Zimmermann P, Hirsch-Hoffmann M, Hennig L, Gruissem W (2004) GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiol 136: 2621–2632




This article has been cited by other articles:

M. Mutwil, B. Usadel, M. Schutte, A. Loraine, O. Ebenhoh, and S. Persson
Assembly of an Interactive Correlation Network for the Arabidopsis Genome Using a Novel Heuristic Clustering Algorithm
Plant Physiology, January 1, 2010; 152(1): 29-43.

T. Obayashi and K. Kinoshita
Rank of Correlation Coefficient as a Comparable Measure for Biological Significance of Gene Coexpression
DNA Res, October 1, 2009; 16(5): 249-260.

M. Mutwil, C. Ruprecht, F. M. Giorgi, M. Bringmann, B. Usadel, and S. Persson
Transcriptional Wiring of Cell Wall-Related Genes in Arabidopsis
Mol Plant, September 1, 2009; 2(5): 1015-1024.

T.-H. Lee, Y.-K. Kim, T. T. M. Pham, S. I. Song, J.-K. Kim, K. Y. Kang, G. An, K.-H. Jung, D. W. Galbraith, M. Kim, et al.
RiceArrayNet: A Database for Correlating Gene Expression from Transcriptome Profiling, and Its Application to the Analysis of Coexpressed Genes in Rice
Plant Physiology, September 1, 2009; 151(1): 16-33.

K. Vandepoele, M. Quimbaya, T. Casneuf, L. De Veylder, and Y. Van de Peer
Unraveling Transcriptional Control in Arabidopsis Using cis-Regulatory Elements and Coexpression Networks
Plant Physiology, June 1, 2009; 150(2): 535-546.

S. M. Brady and N. J. Provart
Web-Queryable Large-Scale Data Sets for Hypothesis Generation in Plant Biology
Plant Cell, April 1, 2009; 21(4): 1034-1051.

Y. Sawada, K. Akiyama, A. Sakata, A. Kuwahara, H. Otsuki, T. Sakurai, K. Saito, and M. Y. Hirai
Widely Targeted Metabolomics Based on Large-Scale MS/MS Data for Elucidating Metabolite Accumulation Patterns in Plants
Plant Cell Physiol., January 1, 2009; 50(1): 37-47.

