First published online August 18, 2006; 10.1104/pp.106.080358 Plant Physiology 142:762-774 (2006) © 2006 American Society of Plant Biologists OPEN ACCESS ARTICLE
Transcriptional Coordination of the Metabolic Network in Arabidopsis1,[W],[OA]
Department of Biostatistics (H.W., T.M., V.S., L.C., G.P.P., A.L.) and Department of Genetics (A.L.), University of Alabama, Birmingham, Alabama 35294; Department of Plant Biology, Carnegie Institution, Stanford, California 94305 (S.P., C.S.); and Department of Biological Sciences, Stanford University, Stanford, California 94305 (C.S.)
Patterns of coexpression can reveal networks of functionally related genes and provide deeper understanding of processes requiring multiple gene products. We performed an analysis of coexpression networks for 1,330 genes from the AraCyc database of metabolic pathways in Arabidopsis (Arabidopsis thaliana). We found that genes associated with the same metabolic pathway are, on average, more highly coexpressed than genes from different pathways. Positively coexpressed genes within the same pathway tend to cluster close together in the pathway structure, while negatively correlated genes typically occupy more distant positions. The distribution of coexpression links per gene is highly skewed, with a small but significant number of genes having numerous coexpression partners but most having fewer than 10. Genes with multiple connections (hubs) tend to be single-copy genes, while genes with multiple paralogs are coexpressed with fewer genes, on average, than single-copy genes, suggesting that the network expands through gene duplication, followed by weakening of coexpression links involving duplicate nodes. Using a network-analysis algorithm based on coexpression with multiple pathway members (pathway-level coexpression), we identified and prioritized novel candidate pathway members, regulators, and cross pathway transcriptional control points for over 140 metabolic pathways. To facilitate exploration and analysis of the results, we provide a Web site (http://www.transvar.org/at_coexpress/analysis/web) listing analyzed pathways with links to regression and pathway-level coexpression results. These methods and results will aid in the prioritization of candidates for genetic analysis of metabolism in plants and contribute to the improvement of functional annotation of the Arabidopsis genome.
The advent of whole-system approaches, such as DNA chips and metabolomics, has created new opportunities for studying how metabolic pathways are coordinated to meet cellular demands (Sweetlove and Fernie, 2005 k , where f is the frequency of nodes having k connections. Although the goodness of fit of power law functions is sometimes controversial, it is clear that most biological networks include a small but significant number of nodes (e.g. genes or proteins) that have a large number of connections, but most nodes have very few (for review, see Albert, 2005
AraCyc (http://Arabidopsis.org/tools/aracyc/) is a database and visualization system for metabolic pathways in Arabidopsis developed by The Arabidopsis Information Resource (TAIR). The first version of the AraCyc database was based on the MetaCyc compendium of known biochemical pathways and output from the Pathologic software, which uses keyword matching to assign gene products to individual pathway steps recorded in MetaCyc. Since then, AraCyc has undergone continuous improvement through manual editing and literature-based curation (Mueller et al., 2003
In previous work, we described a method that uses coexpression relationships inferred from regression analysis of DNA microarray data to identify new players in biological pathways (Persson et al., 2005
Genes Belonging to the Same Pathway Are Coexpressed
We used publicly available data from 486 quality-screened ATH1 array hybridizations to analyze coexpression patterns for metabolic pathway genes in Arabidopsis. The ATH1 expression microarray from Affymetrix contains over 22,000 probe sets that hybridize to one or more Arabidopsis genes (Redman et al., 2004
It is generally expected that gene products that are regulated at the level of mRNA abundance and that collaborate in a shared function or pathway are likely to be coexpressed. To assess whether this was the case with Arabidopsis metabolic pathways, we compared r2 and p values obtained from linear regressions performed between genes annotated as belonging to the same or different metabolic pathways in AraCyc. In general, the population of within-pathway comparisons contained a higher proportion of high-confidence (low p and high r2) results than did comparisons involving genes from different pathways (Table I ). Thus, we found that genes annotated as belonging to the same pathway tend to be more tightly coexpressed than genes from different metabolic pathways, a result that is consistent with results obtained from similar studies in yeast (Ihmels et al., 2004b
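A minimal sketch of the pairwise comparison described above, assuming each gene is represented by a vector of expression values measured on the same set of arrays. It computes only r2 and the sign of the regression slope; with a fixed number of arrays, the regression p value decreases monotonically as r2 increases, so ranking partners by decreasing r2 gives the same order as ranking by increasing p. The function names and the toy expression values are illustrative assumptions, not the software used in the study.

```python
from math import fsum


def regression_r2(x, y):
    """Return (r2, sign) for a simple linear regression of y on x.

    sign is +1 for a positive slope, -1 for a negative slope, and 0 when
    one of the two profiles is constant (no coexpression signal).
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length vectors with at least 3 values")
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    sxx = fsum((a - mean_x) ** 2 for a in x)
    syy = fsum((b - mean_y) ** 2 for b in y)
    sxy = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0
    r2 = (sxy * sxy) / (sxx * syy)
    sign = (sxy > 0) - (sxy < 0)
    return r2, sign


def rank_partners(expr, gene):
    """List every other gene as (name, r2, sign), best coexpressed first."""
    scores = []
    for other, values in expr.items():
        if other != gene:
            r2, sign = regression_r2(expr[gene], values)
            scores.append((other, r2, sign))
    # Decreasing r2; ties are broken alphabetically for reproducibility.
    scores.sort(key=lambda s: (-s[1], s[0]))
    return scores


# Hypothetical toy data: four genes measured on four arrays.
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],   # perfectly positively coexpressed with A
    "C": [4, 3, 2, 1],   # perfectly negatively coexpressed with A
    "D": [1, 3, 2, 4],   # weaker positive coexpression with A
}
ranked = rank_partners(expr, "A")
```

Keeping the slope sign alongside r2 allows positive and negative coexpression to be separated later, as is done in the network analyses.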
Core Metabolic Pathways Display Tighter Level of Transcriptional Coordination
To investigate whether coexpression levels of genes within pathways vary from one pathway to another, we used a random sampling approach to identify pathways that contained above-average numbers of coexpressed gene pairs. For each gene, we created a list of all other genes represented on the ATH1 array and ordered the list by increasing p (or, equivalently, decreasing r2) values. In this scheme, genes within highly coexpressed pathways should appear near the top of each other's coexpression lists and have small ranks. Moreover, the average of their mutual ranks should be unusually small when compared to samples of genes selected at random without regard to their pathway affiliation. To test this, we selected 10,000 random samples of size N for each pathway having N genes, computed the average rank for each sample, and then compared the distribution of average ranks from the samples to the actual average rank obtained for each pathway. The frequency with which we observed average ranks as small or smaller than the actual observed values thus provided an empirically determined, within-pathway coexpression p value for each pathway. Table II presents the most tightly coregulated pathways according to this analysis. These tightly coexpressed pathways were enriched in core metabolic pathways such as glycolysis, tricarboxylic acid (TCA) cycle, and the pentose phosphate pathway, which produce precursors for many other pathways. By contrast, pathways involved in noncore or peripheral biochemical pathways were coexpressed to a lesser degree. A full list of the pathways we analyzed, with links to Web pages for individual genes, pathways, probe sets, and plain-text spreadsheets of regression results, is available at http://www.transvar.org/at_coexpress/analysis/web.
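The random sampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Table II: `rank[a][b]` is assumed to hold the position of gene b in the coexpression list of gene a, and the names `average_mutual_rank` and `empirical_p` are hypothetical.

```python
import random


def average_mutual_rank(genes, rank):
    """Mean of rank[a][b] over all ordered pairs of distinct genes."""
    pairs = [(a, b) for a in genes for b in genes if a != b]
    return sum(rank[a][b] for a, b in pairs) / len(pairs)


def empirical_p(pathway, all_genes, rank, n_samples=10000, seed=0):
    """Fraction of random gene sets whose average rank is as small as,
    or smaller than, the average rank observed for the pathway."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = average_mutual_rank(pathway, rank)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_genes, len(pathway))
        if average_mutual_rank(sample, rank) <= observed:
            hits += 1
    return hits / n_samples
```

With 10,000 samples the smallest nonzero p value that can be reported is 1e-4, so a result of 0 is best read as "below 1e-4" rather than as exactly zero.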
Inferring Coexpressed Genes for Metabolic Pathways
We have shown that pathways are enriched for coexpressed genes, a result that is consistent with the commonly held view that genes involved in related functions are expressed in a coordinate fashion. Previously, we used this aspect of transcriptional regulation to identify new members of cellulose biosynthesis pathways. Using large-scale coexpression results for a group of known CESA genes, we identified candidate genes outside the group that were coexpressed with some or all group members. In this earlier analysis, we observed that although genes in cellulose biosynthesis pathways typically appear near the top of each other's coexpression lists, often there are many more genes that have a higher ranking in terms of coexpression than the other group members. We found that this was also the case for the AraCyc coexpression data set. We found that individual pathway genes are typically coexpressed with tens or sometimes hundreds of genes even at relatively stringent p or r2 value coexpression cutoffs and that often these nonpathway genes outrank other members of the pathway in terms of coexpression (Fig. 2 ).
Previously, we narrowed the field of candidate genes for genetic analyses based on the number of CESA bait genes with which the candidates were coexpressed. That is, we chose candidate genes that were tightly coexpressed with as many bait genes as possible. Here, we present a more general version of this approach that uses both the coexpression set size and p values to select and rank candidates (Fig. 3A ). The method computes a network structure in which genes are considered linked when their linear regression p and r2 values meet a user-defined threshold. It then identifies genes within the network that are linked with multiple members of a given pathway; using graph analysis terminology, this is equivalent to finding genes whose neighborhood of connected genes includes multiple genes in the pathway. Next, it ranks these candidate genes based on the number of connected pathway genes (within-pathway neighborhood size) and resolves ties using the product of regression p values between coexpressed bait and candidate genes.
Note that analyzing the network structure in this way does not require that the pathway members themselves be coexpressed (linked) with each other, although this is often the case. In fact, this is a potential strength of the approach in that it can exploit potential redundancies in the system. For example, two isozymes that perform the same pathway step may not necessarily be coexpressed with each other, but they could each require coexpression with a third gene that supplies necessary functionality. Depending on the strength of coexpression, the approach would identify this third gene or any other genes that are connected with multiple genes within the same pathway group. In recognition that this approach is based on coexpression with multiple pathway members, not just single genes, we have termed this approach pathway-level coexpression (PLC) analysis.
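The selection and ranking steps of PLC analysis can be illustrated with a short sketch. It assumes the pairwise regression results are already available as a dictionary keyed by gene pair, with each unordered pair listed once, and it stores log10 of each p value instead of the p value itself: at cutoffs such as 1E-80, the product of several p values underflows double-precision floats, whereas the sum of their logarithms gives the same ordering. The function name, argument names, and gene labels are hypothetical.

```python
def plc_rank(bait, regressions, max_log10_p=-80.0, min_r2=0.0, min_bait=2):
    """Rank non-bait genes by coexpression with multiple bait genes.

    bait        -- iterable of bait (pathway) gene names
    regressions -- dict mapping (gene_i, gene_j) -> (log10 p value, r2)
    Returns a list of (candidate, n_bait_links, sum_log10_p), best first.
    """
    bait = set(bait)
    linked = {}  # candidate -> log10 p values of its links to bait genes
    for (g1, g2), (log10_p, r2) in regressions.items():
        if log10_p >= max_log10_p or r2 <= min_r2:
            continue  # this pair does not meet the coexpression thresholds
        for cand, partner in ((g1, g2), (g2, g1)):
            if partner in bait and cand not in bait:
                linked.setdefault(cand, []).append(log10_p)
    rows = [(c, len(ps), sum(ps)) for c, ps in linked.items() if len(ps) >= min_bait]
    # More bait partners first; ties go to the smaller product of p values,
    # i.e. the smaller (more negative) sum of log10 p values.
    rows.sort(key=lambda row: (-row[1], row[2], row[0]))
    return rows


# Hypothetical example: three bait genes and five other genes.
bait = {"b1", "b2", "b3"}
regressions = {
    ("c1", "b1"): (-100, 0.60), ("c1", "b2"): (-90, 0.55),
    ("c2", "b1"): (-120, 0.70), ("c2", "b2"): (-95, 0.60), ("b3", "c2"): (-85, 0.50),
    ("c3", "b1"): (-150, 0.80),                              # only one bait partner
    ("c4", "b1"): (-100, 0.60), ("c4", "b2"): (-100, 0.60),
    ("c5", "b1"): (-50, 0.30), ("c5", "b2"): (-60, 0.30),    # fails the p cutoff
    ("b1", "b2"): (-200, 0.90),                              # bait-bait link, ignored
}
ranking = plc_rank(bait, regressions)
```

In this toy example the ranking is c2 (three bait partners), then c4 and c1 (two partners each, with c4 having the smaller p value product). Note that the bait genes b1 and b2 need not be linked to each other for c1, c2, or c4 to be found.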
We used PLC analysis to survey coexpression relationships for 205 AraCyc pathways, using coexpression p value cutoffs ranging from 1E-40 to 1E-200. Figure 3B summarizes the number of genes identified as being connected to one or more pathway members at different p value cutoffs. A coexpression p value of 1E-80 or better and pathway neighborhoods of two or more pathway genes produce 4,022 candidates connected with 144 pathways. Interestingly, we identified more than 100 genes (using p value cutoff 1E-80) that are coexpressed with pathway neighborhoods containing 15 or more genes. These PLC-identified genes were from two of the most highly coexpressed pathways: chlorophyll biosynthesis and the Calvin cycle. For the Calvin cycle pathway, the highest-ranking PLC-identified gene candidates included several putative chloroplast proteins, including four of unknown function and several more with predicted functions related to electron transport and photosynthesis, such as iron binding and ferrochelatase activity. The highest-ranking PLC result for the chlorophyll biosynthesis pathway is GUN4 (AT3G59400), a well-studied regulator of chlorophyll biosynthesis and a key player in plastid-to-nuclear signal transduction (Larkin et al., 2003
Further manual and computational inspection of PLC-identified candidate genes reveals many more that appear to be good candidates for biologically meaningful coordinate expression. The flavonoid biosynthesis pathway (PWY1F-FLAVSYN), one of the best-studied pathways in plant secondary metabolism, provides a representative example (Winkel-Shirley, 2001
The Trp biosynthesis pathway provides another illustration of how coexpression analysis can lead to new hypotheses regarding gene function. Two genes in the pathway are coexpressed with At3g26830 (PAD3), which encodes a cytochrome P450 monooxygenase and was recently shown to catalyze the final step in the camalexin biosynthesis pathway (Zhou et al., 1999
As described above, PLC analysis ranks candidate genes first by the number of coexpressed partners from the bait pathway and second by the p values of the coexpression relationships. Further selection of candidate genes is possible using annotations derived from independent sources unrelated to coexpression data, such as functional information inferred from sequence homology or curated from the literature. For this study, we used GO annotations as a convenient summary of known and predicted functional information for Arabidopsis gene products (Harris et al., 2004). The GO is a structured vocabulary of terms that organizes knowledge of gene products according to their molecular function, biological role, or cellular localization. GO annotations are associations between terms and gene products, and each GO annotation is tagged with an evidence code indicating the annotation source. The GO annotations can aid the evaluation of PLC analysis results in two ways: First, they can direct attention to particular classes of coexpressed genes, such as transcription factors or protein kinases, and second, they can allow further prioritization of candidate genes based on the similarity of annotations with bait genes from the target pathway. We found that terms appearing frequently among annotations associated with the bait genes also appear frequently among the genes identified in the PLC analysis.
For flavonoid biosynthesis, the term chloroplast is one of the most abundantly used terms for genes within the pathway as well as for the pool of candidate genes identified by PLC analysis (Supplemental Table S1). This annotation derives from electronic annotation by TargetP, a program that uses N-terminal sequence information to predict subcellular localization (Emanuelsson et al., 2000). The GO also includes terms indicating that the process, function, or cellular localization of the annotated gene product is currently unknown. We found that a large number of candidate genes identified through PLC analysis are annotated with the unknown-function GO terms, including 1,205 for biological process unknown; 1,021 for molecular function unknown; and 32 for cellular component unknown out of a total of 4,022 PLC-identified genes. If coexpression patterns can imply functional information, then large-scale coexpression analysis as described here has the potential to contribute to functional annotation of Arabidopsis gene products.
Manual inspection of the coexpression relationships between genes in the same pathway reveals that the distribution of connections between pathway genes appears nonrandom with respect to pathway structure and reaction order. As an example, Figure 4A shows a schematic view of positively coexpressed genes from the glycolysis pathway. Subsets of genes in the pathway form three groups of coexpressed genes, one of which involves nearly every pathway step. Other groups involve genes that catalyze adjacent or nearby reaction steps, which may reflect a general trend. To investigate this possibility, we plotted the number of pathway steps that separate coexpressed gene pairs for pathways containing at least six pathway steps (for a list of these pathways, see Supplemental Table S2). Figure 4B summarizes the results; positive coexpression typically involved genes associated with adjacent pathway steps. Negative coexpression, on the other hand, more often involved genes separated by two to three pathway steps.
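The step-separation tally behind this kind of plot can be sketched in a few lines. The sketch makes a simplifying assumption that is not stated in the text: each pathway is treated as a linear chain in which every gene is assigned a single integer step index, so separation is the absolute difference of the indices. Branched pathways would instead need a shortest-path distance on the pathway graph. The function and variable names are hypothetical.

```python
from collections import Counter


def step_separation_histogram(step_of, coexpressed_pairs):
    """Count coexpressed gene pairs by the number of pathway steps between them.

    step_of           -- dict mapping gene -> integer step index in the pathway
    coexpressed_pairs -- iterable of (gene_i, gene_j) pairs
    Pairs involving a gene outside the pathway are skipped.
    """
    hist = Counter()
    for g1, g2 in coexpressed_pairs:
        if g1 in step_of and g2 in step_of:
            hist[abs(step_of[g1] - step_of[g2])] += 1
    return dict(sorted(hist.items()))


# Hypothetical pathway with genes at steps 1, 2, 3, and 6.
steps = {"g1": 1, "g2": 2, "g3": 3, "g4": 6}
pairs = [("g1", "g2"), ("g2", "g3"), ("g1", "g4"), ("g1", "outside")]
histogram = step_separation_histogram(steps, pairs)
```

Running the same tally separately on positively and negatively coexpressed pairs would give the two distributions being compared.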
Topological Features of the Metabolic Network
To explore the topology of the metabolic network, we examined the distribution of linked nodes (genes) in networks based on coexpression relationships (Fig. 5, A and B ). We found that the distribution of links per node in the coexpression network of metabolic genes in Arabidopsis is highly skewed, with most genes having a small number of connections and a small but significant number having many connections (Fig. 5C). For example, at coexpression p value cutoff 1E-80, over 70% and 95% of linked nodes in the positive and negative AraCyc coexpression networks are connected to 10 or fewer genes/nodes. At this same p value threshold, both positively and negatively connected genes formed large networks of interconnected genes, but many genes (over half) lacked coexpression connections with other genes in the AraCyc data set. Overall connectivity within the positive network was higher than for the negative network: network density (actual links divided by possible links) was larger for the positive coexpression network at coexpression p value cutoffs ranging from 10E-40 to 10E-120.
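The quantities discussed here (links per node, the share of linked nodes with 10 or fewer connections, and network density) can be computed from an edge list alone. The sketch below is a standard-library illustration, assuming each coexpression link is listed once as an unordered pair with no self-links; it is not the networkx-based analysis used in the study, and the function names are hypothetical.

```python
from collections import Counter


def degree_counts(edges):
    """Number of coexpression links per linked gene."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def fraction_with_at_most(deg, k=10):
    """Share of linked genes that have k or fewer connections."""
    return sum(1 for d in deg.values() if d <= k) / len(deg)


def network_density(n_genes, edges):
    """Actual links divided by possible links among n_genes genes.

    n_genes counts every gene in the data set, including genes with no links.
    """
    possible = n_genes * (n_genes - 1) // 2
    return len(edges) / possible


# Hypothetical network: one hub with 12 partners plus one isolated pair,
# in a data set of 20 genes (5 of which have no links at all).
edges = [("hub", f"g{i}") for i in range(12)] + [("x", "y")]
deg = degree_counts(edges)
low_degree_share = fraction_with_at_most(deg, k=10)
density = network_density(20, edges)
```

Computing the same three numbers for the positive and negative networks at each p value cutoff reproduces the kind of comparison made in the text.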
We also investigated correlation between connectivity in the coexpression network (i.e. links per node) and number of paralogous genes per node. We used BLASTp to identify homologous sequences and then asked whether genes with larger numbers of paralogs in the Arabidopsis genome tend to have more or fewer coexpression connections with other genes. For the positive coexpression network (coexpression p value < 1E-80), we found that 65% of the 139 hub genes having 20 or more connections were single-copy genes, but only 37% of genes having fewer than 20 connections were single-copy genes. To assess the significance of this difference, we used random sampling to estimate the probability of obtaining such a high percentage of single-copy genes among the 139 hub genes purely by chance. We generated 100,000 random samples, computed the percentage of single-copy genes for each sample, and found that only four of the random samples contained more than 50% single-copy genes. Thus, we find that the relationship between uniqueness in the genome and status as a hub gene is highly significant. Furthermore, single-copy genes have an average of 11.5 connections, but genes with paralogous copies have an average of five connections per gene. We tested whether this difference in average links per gene is significant using the Wilcoxon rank sum test, which allows an assessment of whether or not two samples come from the same underlying distribution. Using this test, we determined that, on average, genes with paralogs are significantly less well connected (p value = 1.2E-5) than genes with no paralogs. We therefore find that highly connected nodes tend to be single-copy genes, whereas less-well connected genes tend to be present in multiple copies in the genome.
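The random sampling test for single-copy enrichment among hubs can be sketched as below. It assumes that the random samples are drawn without replacement from the full set of network genes and that the reported probability is the share of samples whose single-copy fraction reaches a chosen threshold; the function name, arguments, and toy numbers are hypothetical.

```python
import random


def enrichment_p(single_copy_flags, sample_size, threshold_fraction,
                 n_samples=100000, seed=0):
    """Estimate how often a random gene set is as rich in single-copy genes
    as the threshold.

    single_copy_flags  -- list of booleans, one per gene (True = single copy)
    sample_size        -- number of genes per random sample (e.g. hub count)
    threshold_fraction -- single-copy fraction a sample must reach to count
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(single_copy_flags, sample_size)
        if sum(sample) / sample_size >= threshold_fraction:
            hits += 1
    return hits / n_samples


# Hypothetical population: 40 single-copy genes among 100.
flags = [True] * 40 + [False] * 60
p_high = enrichment_p(flags, sample_size=20, threshold_fraction=0.9, n_samples=2000)
```

For the comparison of average links per gene, a rank-based test such as the Wilcoxon rank sum test is a natural companion because link counts are strongly skewed.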
Because pathways are interconnected in the sense that many utilize intermediate metabolites or end products from other pathways, it is likely that some pathways include genes that are highly coexpressed with genes in other pathways. We expect that this would be particularly common for pathways that supply precursors for multiple processes, such as glycolysis or the TCA cycle. We found that this was indeed the case. Figure 6 shows a heatmap visualization in which each cell represents the degree of coexpression (high, medium, and low) between pairs of genes from the different pathways in the corresponding rows and columns. Several pathways contain genes that are highly coexpressed both within and across pathway boundaries. For instance, genes in the photosynthesis light reaction pathway are highly coexpressed with each other and with genes in other pathways, all of which utilize common metabolites, including malate, 3-phosphoglycerate, and Fru-1,6-bisphosphate. Figure 7 diagrams pathways that are linked to the photosynthesis light reaction pathway via coexpressed gene pairs; these include the photorespiration pathway (seven genes), gluconeogenesis (four genes), and the Calvin cycle (four genes). This transcriptional coordination of genes across pathway boundaries suggests corresponding coordination of metabolic flow.
To investigate coregulatory connections between metabolic pathways in greater detail, we computed PLC networks in which each node in the network represents an individual pathway and connections between nodes represent pairs of coexpressed genes in which each member of the pair belongs to one, but not both, of the connected pathway nodes (Fig. 8 ). We considered negative and positive coexpression links separately because of the different distribution of high-confidence coexpression relationships for positive versus negative coexpression. In this scheme, each node-to-node connection represents a high degree of cross pathway coexpression. We found that pathways with the greatest number of internal coexpression connections are also among the most tightly coregulated across pathway boundaries. These included several core metabolic processes relating to energy metabolism, including the Calvin cycle, gluconeogenesis, Fru degradation, and sorbitol and acetate fermentation (Fig. 8). We found that pathways with large numbers of positive cross pathway connections also possess large numbers of negative cross pathway coexpression links (compare Fig. 8, A and B).
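A pathway-level network of this kind can be assembled from gene-level coexpression pairs as sketched below. The sketch assumes that each coexpressed pair is listed once and that a minimum number of cross pathway pairs is required before two pathways are joined (the Methods describe such a threshold, Np); the function name, the `min_pairs` argument, and the pathway and gene labels are hypothetical.

```python
from itertools import combinations


def pathway_links(pathways, coexpressed_pairs, min_pairs=2):
    """Connect pathways that share enough cross pathway coexpressed pairs.

    pathways          -- dict mapping pathway name -> set of member genes
    coexpressed_pairs -- iterable of (gene_i, gene_j), each pair listed once
    A pair counts only if one gene belongs exclusively to the first pathway
    and the other exclusively to the second, so genes shared by both
    pathways never create a link.
    """
    links = {}
    for pa, pb in combinations(sorted(pathways), 2):
        only_a = pathways[pa] - pathways[pb]
        only_b = pathways[pb] - pathways[pa]
        n = sum(
            1
            for g1, g2 in coexpressed_pairs
            if (g1 in only_a and g2 in only_b) or (g1 in only_b and g2 in only_a)
        )
        if n >= min_pairs:
            links[(pa, pb)] = n
    return links


# Hypothetical example: gene "s" belongs to both of the first two pathways.
pathways = {
    "calvin": {"a", "b", "s"},
    "gluconeogenesis": {"c", "d", "s"},
    "other": {"e"},
}
pairs = [("a", "c"), ("b", "d"), ("s", "c"), ("a", "e")]
links = pathway_links(pathways, pairs, min_pairs=2)
```

Running the function once on positively coexpressed pairs and once on negatively coexpressed pairs keeps the two kinds of link separate, as in the analysis described above.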
Using linear regression p and r2 values to identify and rank coexpression relationships, we showed that, on average, genes involved in the same metabolic pathway are coexpressed to a greater degree than genes involved in different pathways. However, most genes in the AraCyc data set are coexpressed with tens to hundreds of genes, only a small number of which are annotated as belonging to the same pathway or pathways. If understanding a pathway of interest is the main analytical focus, then a method of narrowing the field of candidates is required. To facilitate this type of analysis, we developed a PLC analysis approach that identifies and ranks candidate genes based on coexpression with groups of pathway genes and the relative strength (p and/or r2 values) of these coexpression relationships (Fig. 3A).
We used an earlier version of the PLC analysis to identify novel genes involved in cellulose biosynthesis in Arabidopsis (Persson et al., 2005 Using random sampling, we computed empirical p values assessing the degree to which genes in each pathway are coexpressed with each other. We found that core metabolic pathways exhibited an unusually high level of within-pathway coexpression. We also identified pathways that possess multiple positive and negative coexpression links across pathway boundaries. These results are based on over 400 individual array hybridizations involving many different cell types, developmental stages, and experimental treatments. Coexpression, in this setting, means that regardless of the experimental condition, high (or low) expression of one member of a coexpression pair (or group) predicts similarly high (or low) expression of the other group members. We do not suggest that the coexpressed genes are expressed in every cell type, only that when they are expressed, they are expressed together. We suggest that the groups of coregulated genes are involved in maintaining and regulating metabolic flow within and across pathways. In addition, we saw that not all genes annotated as belonging to a pathway are involved in within- or across-pathway coexpression relationships. These genes may serve specialized functions that are not apparent when hundreds of experiments are considered. Alternatively, some of these genes may be incorrectly assigned in AraCyc, possibly reflecting the computational origins for AraCyc pathway annotations.
We found that the distribution of positive and negative coexpression relationships is highly skewed; the majority of genes have few links but a small but significant number of genes are very well connected. Similarly skewed distributions have been observed in a number of different biological networks, and it has been proposed that these networks arise in an incremental fashion via two mechanisms: duplication of components of the existing network and random mutation (for review, see Albert, 2005
We found that the majority of highly connected genes (20 or more coexpression links) were single-copy genes. In yeast, single-copy genes exhibit a higher proportion of lethal or reduced-fitness phenotypes than do genes with duplicates (Gu et al., 2003
An earlier study from Wille et al. (2004) We believe that the results and methods presented here can aid scientists in choosing candidate genes for genetic analysis based on their position in the coexpression network. We recommend that researchers seeking to characterize any group of functionally related genes perform group- or PLC analysis to identify key players within and outside the group whenever there is good reason to expect that membership in the group will imply coexpression. As demonstrated here, the abundance of microarray expression data for Arabidopsis now available makes this analysis both feasible and productive. Furthermore, the results from coexpression analysis could help to improve annotation of the Arabidopsis genome. Indeed, we propose that lists of high-confidence coexpression partners could be added to gene-level Web pages at sites such as TAIR, providing a new dimension of functional annotation for the Arabidopsis genome.
Data Files
AraCyc data are from version 2.1 of the database as available in August, 2005. Data were obtained from TAIR (www.arabidopsis.org) as a flat file dump that listed accessions for 221 different pathways associated with 1,612 genes. Affymetrix ATH1 GeneChip probe set and target gene information are from an annotations data file downloaded from the Affymetrix Web site in August, 2005 and dated June 20, 2005. GO annotations are from a file downloaded from TAIR's ftp site September, 2005. Copies of all primary data files are available upon request.
To map genes onto probe sets and vice versa, we cross-referenced gene identifiers from the AraCyc database flat file against the AGI and Representative Public ID fields in the Affymetrix ATH1 probe set annotations file. This mapping produced a list of 1,488 probe sets. We purged redundant and promiscuous probe sets, i.e. genes mapped to multiple probe sets and probe sets recognizing more than one gene, to create a list of 1,330 AraCyc-associated probe sets. In a few cases, an AraCyc gene identifier was not represented on the ATH1 array. Visualization of a randomly selected subset of these using the Integrated Genome Browser, which shows the location of ATH1 probe sets alongside Arabidopsis (Arabidopsis thaliana) genome version 5 gene annotations, revealed that these genes are not interrogated on the ATH1 array, most likely because they appeared in the public databases after the ATH1 array entered production. The Integrated Genome Browser is available at http://www.affymetrix.com/support/developer/tools/download_igb.affx. A list of all pathways, probe sets, and gene identifiers is available at http://www.transvar.org/at_coexpress/analysis/web.
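The purging step can be illustrated with a small sketch. It assumes one particular reading of the text: a probe set is kept only if it recognizes exactly one gene and that gene is recognized by no other probe set, so all members of a redundant group are dropped instead of one being retained. The function name and the probe set and gene labels are hypothetical.

```python
from collections import defaultdict


def unique_probe_sets(probe_to_genes):
    """Return {probe_set: gene} for one-to-one probe set/gene mappings only.

    probe_to_genes -- dict mapping probe set ID -> set of gene IDs it recognizes
    """
    gene_to_probes = defaultdict(set)
    for probe, genes in probe_to_genes.items():
        for gene in genes:
            gene_to_probes[gene].add(probe)

    kept = {}
    for probe, genes in probe_to_genes.items():
        if len(genes) != 1:
            continue  # promiscuous: recognizes more than one gene
        (gene,) = genes
        if len(gene_to_probes[gene]) != 1:
            continue  # redundant: the gene maps to several probe sets
        kept[probe] = gene
    return kept


# Hypothetical annotation table.
annotations = {
    "ps1": {"geneA"},
    "ps2": {"geneB", "geneC"},   # promiscuous probe set
    "ps3": {"geneD"},            # geneD is also recognized by ps4
    "ps4": {"geneD"},
}
clean = unique_probe_sets(annotations)
```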
We obtained 553 CEL files for Affymetrix ATH1 array experiments from the Nottingham Arabidopsis Stock Center AffyWatch subscription service. A number of the files obtained were duplicates; after removing these, we processed the remaining CEL files using the robust multichip average algorithm implementation in Bioconductor (Gentleman et al., 2004
For each probe set (gene) associated with AraCyc, use simple linear regression to compare its vector of N expression values
For each AraCyc-associated probe set, we sorted its regression results by increasing p value and computed the average of the mutual ranks for each pathway probe set in the sorted lists of the other pathway probe sets. We used random sampling of probe sets to compute an empirical distribution of average ranks: for each pathway with M probe sets, we selected a random sample of size M from the 1,330 AraCyc probe sets in the study and computed its average rank. We repeated the sampling procedure 10,000 times for each M to develop an empirical distribution of average ranks for pathways including M genes. The average rank for each pathway was then compared to the empirical distribution of average ranks for a pathway of that size to estimate the p value for within-pathway coexpression. The heatmaps showing within- and across-pathway coexpression patterns were generated using matrix2png (Pavlidis and Noble, 2003
Pathway- or group-level coexpression identifies and ranks genes based on their coexpression with a group of genes, such as a metabolic pathway. The procedure operates as follows:
(1) Select a subset of functionally related bait genes, $B = \{g_1, g_2, \ldots, g_M\}$ (e.g. all the members of a metabolic pathway) from the larger set $G$ of all genes $g_i$ and $g_j$ represented on an expression microarray, e.g. ATH1.
(2) For every pairwise comparison between $g_i$ and $g_j$, where one or both are in $B$, perform linear regression between $g_i$ and $g_j$, yielding p value $p_{ij}$ and coefficient of determination $r_{ij}^2$.
(3) Use the set of p and r2 values obtained from the pairwise regressions to construct an undirected graph, where an edge $e_{ij}$ connects $g_i$ and $g_j$ whenever $p_{ij} < p_t$ and $r_{ij}^2 > r_t^2$ for user-defined thresholds $p_t$ and $r_t^2$. Any two genes $g_i$ and $g_j$ that share an edge (link) in the resulting network graph are considered to be coexpressed.
(4) Using the coexpression network graph, identify every candidate gene $c_i$ where $c_i$ is coexpressed with two or more bait genes. Define $B_i = \{g_1, g_2, \ldots, g_K\}$ as the set of $K > 1$ bait genes coexpressed with candidate gene $c_i$ and $P = \{p_{i1}, p_{i2}, \ldots, p_{iK}\}$ as the set of p values associated with coexpressed gene pairs $\{(c_i, g_1), (c_i, g_2), \ldots, (c_i, g_K)\}$.
(5) To prioritize candidates for manual analysis, order the list of candidate genes by the relative sizes of their bait gene sets $|B_i|$, such that if $|B_i| > |B_j|$ for $c_i$ and $c_j$, then $c_i$ is listed before $c_j$. When $|B_i| = |B_j|$, list $c_i$ first whenever the product of its coexpression p values (p
We used BLASTp to search the 1,330 AraCyc pathway genes used in the study against a database of Arabidopsis protein sequences obtained from TAIR. We considered hits as paralogs when the query and subject shared greater than 70% amino acid sequence identity across 90% or more of both sequences.
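The paralog criterion can be written as a simple filter over BLASTp hits. The sketch assumes each hit has already been parsed into a dictionary with the fields shown (percent identity, aligned length on each sequence, and the full length of each sequence); these field names are hypothetical and do not correspond to any specific BLAST output format.

```python
def is_paralog_hit(hit, min_identity=70.0, min_coverage=0.90):
    """True when identity exceeds 70% over at least 90% of BOTH sequences."""
    query_cov = hit["aligned_query"] / hit["query_len"]
    subject_cov = hit["aligned_subject"] / hit["subject_len"]
    return (
        hit["pct_identity"] > min_identity
        and query_cov >= min_coverage
        and subject_cov >= min_coverage
    )


def count_paralogs(hits):
    """Map each query gene to its number of distinct paralogs.

    Self-hits are ignored; genes absent from the result are single copy
    under this criterion.
    """
    paralogs = {}
    for hit in hits:
        if hit["query"] != hit["subject"] and is_paralog_hit(hit):
            paralogs.setdefault(hit["query"], set()).add(hit["subject"])
    return {gene: len(subjects) for gene, subjects in paralogs.items()}


# Hypothetical parsed hits for one query protein of 400 residues.
hits = [
    {"query": "geneA", "subject": "geneA", "pct_identity": 100.0,
     "aligned_query": 400, "aligned_subject": 400, "query_len": 400, "subject_len": 400},
    {"query": "geneA", "subject": "geneB", "pct_identity": 85.0,
     "aligned_query": 380, "aligned_subject": 380, "query_len": 400, "subject_len": 410},
    {"query": "geneA", "subject": "geneC", "pct_identity": 92.0,
     "aligned_query": 200, "aligned_subject": 200, "query_len": 400, "subject_len": 900},
]
paralog_counts = count_paralogs(hits)
```

Requiring coverage of both sequences excludes hits such as geneC above, where a short, highly similar region is shared with a much longer protein.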
Networks of coexpressed genes were assembled from pairwise linear regression results comparing AraCyc metabolic pathway genes to each other. We analyzed a number of different networks, which varied by the linear regression p and r2 value thresholds used to define coexpression. Pathways were considered connected when they shared at least Np pairs of coexpressed genes, where neither member of a pair was in both pathways and Np varied from two to seven, depending on the analysis. Coexpression networks were analyzed using the networkx Python toolkit for computing on graphs (https://networkx.lanl.gov/) and visualized using the Cytoscape network visualization software program (Shannon et al., 2003
The following materials are available in the online version of this article. Supplemental Table S1. GO terms for the flavonoid biosynthesis pathway and genes identified using the PLC algorithm. Supplemental Table S2. List of pathways analyzed in Figure 4B. Supplemental Figure S1. Fully labeled heatmap showing coexpression patterns within and across pathways.
The authors thank Sue Rhee, Peifen Zhang, and the TAIR AraCyc group for providing AraCyc database files and for thoughtful comments on the study. We also thank Alistair Fernie for comments on the manuscript. Received March 22, 2006; accepted August 2, 2006; published August 18, 2006.
1 This work was supported by the National Science Foundation (grant no. 0217651), the U.S. Department of Energy (grant no. DEFG0203ER20133), and a Swedish Research Council Fellowship (grant no. 62320044254 to S.P.).
2 These authors contributed equally to the paper. The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Ann Loraine (aloraine{at}uab.edu).
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription. www.plantphysiol.org/cgi/doi/10.1104/pp.106.080358 * Corresponding author; e-mail aloraine{at}uab.edu; fax 2059752540.