Pearce, Simon and Ferguson, Alison and King, John and Wilson, Zoe A. (2015) FlowerNet: a gene expression correlation network for anther and pollen development. Plant Physiology, 167 (4). pp. 1717-1730.

Floral formation, in particular anther and pollen development, is a complex biological process with critical importance for seed set and for targeted plant breeding. Many key transcription factors regulating this process have been identified; however, their direct role remains largely unknown. Using publicly available gene expression data from Arabidopsis (Arabidopsis thaliana), focusing on those studies that analyze stamen-, pollen-, or flower-specific expression, we generated a network model of the global transcriptional interactions (FlowerNet). FlowerNet highlights clusters of genes that are transcriptionally coregulated and therefore likely to have interacting roles. Focusing on four clusters, and using a number of data sets not included in the generation of FlowerNet, we show that there is a close correlation in how the genes are expressed across a variety of conditions, including male-sterile mutants. This highlights the important role that FlowerNet can play in identifying new players in anther and pollen development. However, due to the use of general floral expression data in FlowerNet, it also has broad application in the characterization of genes associated with all aspects of floral development and reproduction. To aid the dissection of genes of interest, we have made FlowerNet available as a community resource (http://www.cpib.ac.uk/anther). For this resource, we also have generated plots showing anther/flower expression from a variety of experiments: these are normalized together where possible to allow further dissection of the resource.

FlowerNet: A Gene Expression Correlation Network for Anther and Pollen Development
Simon Pearce, Alison Ferguson, John King, and Zoe A. Wilson
Division of Plant Crop Sciences (S.P., A.F., Z.A.W.) and Centre for Plant Integrative Biology (S.P., J.K., Z.A.W.), School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, Leicestershire LE12 5RD, United Kingdom; and School of Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD, United Kingdom (S.P., J.K.)

Anther and pollen development is a complex biological process that is indispensable for the generation of male gametophytes and the production of the next generation in flowering plants. This development includes a series of crucial events that require interactions between gametophytic and sporophytic genes in a cooperative fashion (Goldberg et al., 1993; Ma, 2005). The mature anther contains four lobes, each of which contains meiotic cells surrounded by four somatic cell layers, with the sporophytic tapetum playing an important role in pollen grain development (Goldberg et al., 1993). The tapetum serves as a nutritive tissue, providing metabolites, nutrients, and cell wall precursors for the development of pollen grains.
The use of classical genetic screens has uncovered a large number of genes involved in anther development and in the production of viable pollen. For example, in Arabidopsis (Arabidopsis thaliana), a number of genes encoding putative transcription factors involved in all aspects of anther development have been discovered. APETALA3 (AP3; Jack et al., 1994), SPOROCYTELESS/NOZZLE (SPL/NZZ; Liu et al., 2009), and EXCESS MICROSPOROCYTES1 (Zhao et al., 2002) have important roles in early anther development and differentiation, while ABORTED MICROSPORE (AMS; Sorensen et al., 2003; Xu et al., 2010), DEFECTIVE IN TAPETAL DEVELOPMENT AND FUNCTION1 (Zhu et al., 2008), DYSFUNCTIONAL TAPETUM1 (DYT1; Zhang et al., 2006), MALE STERILITY1 (MS1; Wilson et al., 2001; Alves-Ferreira et al., 2007; Ito et al., 2007; Yang et al., 2007a), MYB33/MYB65 (Millar and Gubler, 2005), and MYB80/MYB103 (Higginson et al., 2003; Zhang et al., 2007; Zhu et al., 2010) are involved in tapetum and pollen wall development (Wilson and Zhang, 2009). MYB26 (Steiner-Lange et al., 2003; Yang et al., 2007b) and RECEPTOR-LIKE PROTEIN KINASE2 (RPK2; Mizuno et al., 2007a) are involved in the later stages of anther dehiscence and pollen release. Although a number of the main regulators are known and a large number of genes directly involved in anther and pollen development have been identified, the means by which they interact and function in this process remain largely uncharacterized.
In this work, we try to address this knowledge gap by uncovering groups of genes involved in floral development, specifically focusing on anther and pollen development, through the creation of a coexpression network. This coexpression network connects genes expressed during floral development by their transcriptional regulation. Coexpressed genes have an increased likelihood of being involved in the same developmental or biochemical pathways; therefore, the correlation of gene expression is a powerful approach to analyze large data sets to identify genes involved in the same functional pathway (Peng and Weselake, 2011; Wang et al., 2012). This approach has been successfully applied to two state-dependent sets of interactions associated with seed dormancy or germination, with good correlation of transcriptional regulators of known seed dormancy and germination regulatory genes (Bassel et al., 2011).
With the recent advances in postgenomic technologies, a large number of genome-wide transcriptomic data sets have been generated, often to look at specific changes such as a transcription factor mutant compared with a wild type. Deposition of these data sets into publicly accessible online databases enables other researchers to analyze these collated data and uncover novel information, which may be tangential to the goal of the original experiment. In this study, we used publicly available gene expression data deposited in the National Center for Biotechnology Information's Gene Expression Omnibus (Edgar et al., 2002; Barrett et al., 2013), choosing microarray data sets that included flowers, buds, anthers, or pollen materials to generate a coexpression network. Through generating the network, compact clusters of coexpressed genes were found (i.e. sets of genes having similar expression patterns across all the samples). This clustering enables the identification of novel genes that may be involved in specific biological processes based on known components within the cluster.
Additionally, we reanalyzed two previously conducted two-color microarray experiments associated with anther and pollen development (Alves-Ferreira et al., 2007; Xu et al., 2010) in order to compare gene expression across these time series as well as between the mutants. Along with the Affymetrix microarrays used to generate the correlation network, these three data sets give a clear indication of the spatial and temporal expression of anther-related genes. Plots have been created for each gene in the correlation network and are available online at http://www.cpib.ac.uk/anther.
In this work, we have generated a valuable tool for the analysis of genes involved in anther and pollen development and show how this can be used to identify new players in anther and pollen development. However, the data used for the generation of the correlation network are from all floral organs and developmental stages; therefore, FlowerNet also has broader application in the characterization of all genes associated with aspects of plant reproduction and flowering.

Gene Expression Plots
Arabidopsis anther, stamen, and bud microarray data sets were selected that comprised wild-type or mutant samples associated with anther and pollen development; all mutants selected displayed male-sterile phenotypes linked to a failure of pollen formation or dehiscence. The Affymetrix microarrays were renormalized together (see "Materials and Methods") to make the data comparable between samples.
A number of additional highly relevant data sets were also available, but since they utilize different transcriptomic platforms, they cannot be integrated easily with the Affymetrix array data sets. Alves-Ferreira et al. (2007) used Agilent two-color microarrays to analyze the effect of MS1 (At5g22260) on gene expression, comparing buds between the wild type and ms1 mutants. In this experiment, the first sample consists of stage 13 flowers, with subsequent samples containing successively younger buds, until the seventh sample contains the very earliest buds and the inflorescence meristem. Xu et al. (2010) used an in-house-printed two-color microarray to compare wild-type and ams buds at pollen mother cell meiosis, pollen mitosis I, bicellular pollen, and pollen mitosis II stages. This second, developmental time course has finer temporal and spatial detail around these key stages and therefore may be used in conjunction with the one generated from the data of Alves-Ferreira et al. (2007).
Data from these two-color microarray experiments (Alves-Ferreira et al., 2007; Xu et al., 2010) were renormalized using a method similar to that of single-color arrays (see "Materials and Methods"), producing two comparative developmental time courses of gene expression in wild-type, ms1, and ams buds. This enables us to use the wild-type and mutant values as a biologically meaningful developmental time course. It is noted that this method seems more prone to noise than the use of single-color microarrays, but it is particularly useful where genes are not present on the Affymetrix microarrays, such as MS1.
The resulting plots (e.g. for AMS; Fig. 1) from both the Affymetrix and two-color array data give immediate visual indication of how the genes are expressed across various samples. These plots show the individual replicates rather than means, allowing those genes whose behavior appears to be variable, possibly due to sensitivity to conditions or to the precise staging of the samples, to be easily visually detected. These plots are freely available online at http://www.cpib.ac.uk/anther, allowing quick analysis of anther expression data for genes of interest. In the plots, closed circles represent wild-type samples, and open symbols represent the different mutants.

FlowerNet Correlation Network
In order to investigate gene behavior across a wide range of samples, we generated a correlation network using 66 Arabidopsis Affymetrix wild-type microarrays from plant reproduction-related experiments, which included flowers, buds, anther/stamen samples, and isolated pollen samples (see "Materials and Methods"; Supplemental Files S1 and S2). These microarray data sets were mostly publicly available in the National Center for Biotechnology Information's Gene Expression Omnibus (Honys and Twell, 2004; Schmid et al., 2005; Mandaokar et al., 2006; Yang et al., 2007a) with some previously unpublished data (Z.A. Wilson and J. Song, unpublished data). A number of other anther-related published studies were used for gene list overlay comparison; however, these could not be included in this network, or in the expression plots, since the raw data for these analyses were not publicly available at the time of construction.
The resulting network has 605,686 edges between 10,797 genes (Fig. 2); a visualization of the network is available as a Cytoscape file (Supplemental File S3), with a navigable list of genes that are connected to each gene presented under each gene expression plot image.
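As a rough illustration of how a coexpression network of this kind can be assembled, the sketch below computes pairwise correlations between expression profiles and keeps gene pairs above a cutoff as weighted edges. The use of the Pearson coefficient, the 0.8 threshold, and the gene names and values are all assumptions made for the example; the procedure actually used for FlowerNet is the one described in "Materials and Methods."

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def correlation_network(expression, threshold=0.8):
    """Return {(gene_a, gene_b): r} for every gene pair with r >= threshold.

    `threshold` is an assumed cutoff for illustration only.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if r >= threshold:
            edges[(gene_a, gene_b)] = r
    return edges


# Toy profiles over five samples (made-up values, hypothetical gene names).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
}
network = correlation_network(toy, threshold=0.8)
```

Only the positively correlated pair (geneA, geneB) is retained here; geneC is anticorrelated with both and is left unconnected, mirroring the focus on positive correlation in the clusters described in the text.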
Within this network, compact highly intracorrelated clusters were identified using the TransClust algorithm (Wittkop et al., 2010) and the correlation values as an edge weight to generate small, well-connected clusters based on expression in the network (see "Materials and Methods"). The resultant clusters describe groups of genes that are highly positively correlated across all the samples included, with almost all the edges between the genes included (for cluster allocations, see Supplemental File S4). These compact clusters have extremely similar expression patterns within the samples used, avoiding the problem of chain-like clusters, where genes are several neighbors removed from those at the other end of the cluster, leading to genes in the same cluster having very diverse expression patterns. The resulting clusters contain almost all of the possible edges between the nodes involved, ensuring that all the genes considered correlate well with each other. For example, the largest cluster (cluster 1) contains 14,433 edges between 171 genes, 99.3% of the possible edges; thus, these genes have very similar expression profiles. Figure 3 shows a representative cluster (cluster 110), which demonstrates how the expression profile of each gene within the cluster has a highly correlated expression pattern in the microarrays used to generate the network.
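The percentage of possible edges quoted for a cluster follows from the number of gene pairs, n(n - 1)/2 for n genes. The short sketch below reproduces the figure for cluster 1 and for cluster 21 (31 genes and 464 edges, as reported later in the text); the function names are ours and purely illustrative.

```python
def possible_edges(n_genes):
    """Number of distinct gene pairs in a cluster of n_genes nodes."""
    return n_genes * (n_genes - 1) // 2


def edge_density(n_genes, n_edges):
    """Fraction of the possible edges that are present in the cluster."""
    return n_edges / possible_edges(n_genes)


# Cluster 1: 171 genes, 14,433 edges (values taken from the text).
cluster_1_density = edge_density(171, 14433)
# Cluster 21: 31 genes, 464 edges (values taken from the text).
cluster_21_density = edge_density(31, 464)

print(f"cluster 1:  {100 * cluster_1_density:.1f}% of possible edges")
print(f"cluster 21: {100 * cluster_21_density:.2f}% of possible edges")
```

A density of exactly 1 corresponds to a clique, the term used for clusters 37, 81, and 116.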
Overlaying expression data from different experiments that have not been used in the creation of the network (e.g. mutant data from the two-color microarrays detailed above) indicates that certain clusters are significantly overrepresented for these transcripts. Similarly, the majority of the clusters contain a significant overrepresentation of Gene Ontology (GO) terms, suggesting that genes within clusters are involved in the same developmental processes. A number of the clusters are enriched for general plant pathways; for example, cluster 2 is enriched in the GO term photosynthesis, cluster 11 in GA biosynthesis, and cluster 16 in water transport, although these do not show significant expression changes linked to anther/stamen-specific gene expression. Other clusters show reproduction-related enrichment; for example, cluster 67 is a set of 14 genes that all have higher expression in the carpel (and to a lesser extent the petal) samples from the AtGenExpress Developmental Map, which have not been used in the generation of the network. To focus on anther/stamen-related clusters, we used a number of data sets to select clusters of interest, such as stamen-specific transcripts (Wellmer et al., 2004), the presence of known anther-related genes within a cluster, and clusters that are highly regulated by previously described male-sterile mutants such as ams and ms1; these are shown in Figure 4.
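This passage does not say which statistical test underlies the overrepresentation calls. One common choice for this kind of question is the one-sided hypergeometric test, sketched below as an assumption rather than as the authors' method. The network size (10,797 genes) and the size of cluster 67 (14 genes) come from the text; the counts of annotated genes are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sample, observed):
    """P(X >= observed) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term of interest."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Hypothetical example: 200 of the 10,797 network genes carry some annotation,
# and 6 of the 14 genes in a cluster carry it (both counts are made up).
p_value = hypergeom_upper_tail(population=10797, annotated=200, sample=14, observed=6)
print(f"P(X >= 6) = {p_value:.3g}")
```

In practice such p-values would also need correcting for the number of clusters and terms tested, a step this sketch leaves out.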

Clusters Associated with Pollen Wall Formation
In order to validate the association between genes identified within the same cluster, four clusters were selected for detailed investigation. These were chosen based on their expression patterns, GO annotations, and behavior in other data sets. Each of these clusters has overrepresented GO terms associated with pollen or anther development. The selected clusters show enrichment of gene expression at specific time points during pollen development based upon staged bud expression data (Xu et al., 2010); cluster 37 includes genes that show up-regulation at pollen mother cell meiosis, cluster 81 at pollen mitosis I, cluster 21 at the bicellular pollen stage, and cluster 116 at pollen mitosis II (for a review of pollen development, see Borg et al., 2009; Table I). This allowed us to infer a good overview of the developmental expression changes during anther and pollen development through these clusters (Fig. 5).
Figure 6 shows the expression patterns of individual genes within these particular clusters, looking at pollen-specific expression stages and the whole bud; both of these microarray data sets were used to create the network and illustrate the expected similar behavior of gene expression within each cluster. To interrogate these clusters further, we collected publicly available microarray data from various mutants known to have an important regulatory role during anther and pollen development. We chose eight mutants: ap3 (early stages only); spl/nzz and ems, which are involved in anther cell specification (Zhang et al., 2006; Alves-Ferreira et al., 2007; Wijeratne et al., 2007); dyt1 and ams (combination of all four stages); ms1 and myb80 (previously known as myb103), which are involved in tapetum development and mature pollen (Zhang et al., 2006; Alves-Ferreira et al., 2007; Xu et al., 2010; Phan et al., 2011); and rpk2 (flower stages 12-14), which is involved in anther dehiscence (Mizuno et al., 2007b). Wild-type data were used to initially generate the FlowerNet network, but when the corresponding mutant data were overlaid onto the network, a large proportion of genes within each cluster showed correlated expression in the specific mutants (Fig. 7). This clustering of expression was also shown as a heat map (Fig. 8), utilizing developmental staged microarray data from the ms1 mutant (Alves-Ferreira et al., 2007). This suggests not only that these genes are coexpressed but also that they may be coregulated.
To characterize the clusters further, we compared the gene expression of these clusters in the developmental time courses generated from the data of Xu et al. (2010) and Alves-Ferreira et al. (2007). In Figure 5, each gene within the four clusters is represented by a different color line, the thick black line being the median trend line, while Figure 8 shows a heat map of expression levels of each gene within each of the four chosen clusters. From both of these sets of data, the genes in the clusters are expressed in a similar manner and in most cases at a similar level of expression. Therefore, despite only using the wild-type Affymetrix arrays to generate the network, the clusters show similar expression in the developmental time courses available in two-color Agilent arrays (Alves-Ferreira et al., 2007) as well as the various sets of corresponding mutant data. This implies that the genes within the clusters have expression patterns that are closely correlated, suggesting that they may be involved in the same process and potentially may be regulated by the same mechanism. Use of this network clustering to find clusters that change in different mutant backgrounds may allow new targets and processes to be inferred from the results. To focus on these coexpression similarities, we also looked at the individual genes in each cluster in more detail to compare expression and show that, within a cluster, the genes show very similar expression in both wild-type and mutant data (Fig. 9).
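A median trend line of the kind described for Figure 5 can be obtained by taking, at each time point, the median expression across all genes in a cluster. The sketch below is a minimal illustration with made-up values and hypothetical gene names; it makes no claim about how the published figure was actually drawn.

```python
from statistics import median


def median_trend(profiles):
    """Median expression across genes at each time point.

    `profiles` maps a gene name to its expression values, one per time point,
    with all genes measured at the same time points.
    """
    series = list(profiles.values())
    return [median(values) for values in zip(*series)]


# Three hypothetical genes measured at four developmental stages (made-up values).
cluster_profiles = {
    "gene1": [1.0, 4.0, 2.0, 1.0],
    "gene2": [1.5, 5.0, 2.5, 0.5],
    "gene3": [0.5, 4.5, 3.0, 1.5],
}
trend = median_trend(cluster_profiles)
print(trend)
```

The median is less sensitive than the mean to a single gene with an unusual profile, which makes it a reasonable summary when one or two cluster members deviate from the rest.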

Cluster 21: Pollen Exine Formation
Cluster 21 contains 31 genes with 464 edges between them (99.78% of the possible edges), which are involved in pollen exine formation, and contains three characterized genes that are known to be pollen specific, Arabidopsis SUPPRESSOR OF ACTIN (AtSAC1b), ADENOSINE-5'-PHOSPHOSULFATE KINASE (APK3), and VACUOLAR H+-ATPASE SUBUNIT E (VHA-E2; Despres et al., 2003; Dettmer et al., 2010; Mugford et al., 2010); 23 of these genes were found to be specifically expressed in stamens (Wellmer et al., 2004). The majority of the genes within this cluster have an unknown role, with the GO term pollen exine formation overrepresented; however, their specific function is currently undetermined. These genes are up-regulated in the stamen of stage 12 flowers (using the data from Schmid et al. [2005]) and in pollen; the majority are down-regulated in old ms1 buds, particularly in the second and third samples of the time course from Alves-Ferreira et al. (2007; sample 1 equates to mature flowers and sample 7 to immature buds/inflorescence meristem). This corresponds with a peak in the wild-type bicellular pollen development stage of buds in the data from Xu et al. (2010), with the majority of the genes within this cluster being down-regulated in the ams mutant from the same data set. The majority of these genes have higher expression in the tapetum compared with other anther tissues at pollen mitosis II (Z.A. Wilson and G. Vizcay-Barrena, unpublished data). Twenty-one of these genes were found to be down-regulated in stage 5 to 8 myb80 anthers (Phan et al., 2011), suggesting that this cluster acts downstream of AMS and MYB80.
Of the three pollen-specific genes, AtSAC1b shows specific expression in pollen, with higher levels at the uninucleate stage (Despres et al., 2003), VHA-E2 is specific to the vegetative cell in the pollen (Strompen et al., 2005), while APK3 has expression throughout development, with strongest expression in pollen grains (Mugford et al., 2009). The single mutants in all of these three genes have no phenotype, most likely because of functional redundancy of other isoforms expressed within the pollen: only in the triple mutant of APK1 isoforms (apk1 apk3 apk4) is there a pollen lethality phenotype (Mugford et al., 2010). APK3 is an ADENOSINE-5'-PHOSPHOSULFATE kinase involved in sulfur metabolism, while AtSAC1b has been suggested to regulate the PHOSPHOINOSITOL-4-phosphate pool and could control ATP transport (Despres et al., 2003). This links with VHA-E2, which is a vacuolar H+-ATPase that plays an important role in maintaining the pH of endomembrane compartments in eukaryotic cells (Dettmer et al., 2010). The double and triple knockouts of AtSAC1b and VHA-E2 would be interesting to study to see if they have a similar phenotype to APK3 and discover what role these families of genes have in pollen development. Another two published genes from this cluster, AT5G15490 (UDP-GLUCOSE DEHYDROGENASE; Reboul et al., 2011) and AT3G51490 (TONOPLAST MONOSACCHARIDE TRANSPORTER1; Wormit et al., 2006), are involved in cell wall composition and sugar transport, which also may be important for pollen wall formation.
[Figure legend fragment: ... number of criteria such as known anther-related genes and male-sterile mutant expression. Highlighted in pink are those genes found to be stamen specific (Wellmer et al., 2004).]
By focusing research on the other cluster members within the group, we may be able to infer further roles of this cluster, as many of the genes have predicted roles in protein movement and in secretion. It will be interesting to discover the roles they play in anther and pollen development.

Cluster 37: Sporopollenin Biosynthesis
Cluster 37 is a 20-node clique, with all edges between them present, and contains a number of genes known to be tapetum specific and linked to sporopollenin biosynthesis. Of these 20 genes, 11 were found to be stamen specific (Wellmer et al., 2004); however, two of these 20 genes are not included in the array analysis. The genes in this cluster are up-regulated in flowers at stage 9 to 11 (Schmid et al., 2005) and are mostly up-regulated in the ms1 mutant, particularly in the third to fifth samples of the time course generated from the data of Alves-Ferreira et al. (2007), while the majority of these genes are down-regulated in ams mutants, particularly at the meiosis and pollen mitosis I stages (Xu et al., 2010). This corresponds to a peak in expression in wild-type data from the same data set at the meiosis stage, with the majority of the genes more highly expressed in the tapetum than in the other anther tissues at the tetrad stage (Z.A. Wilson and G. Vizcay-Barrena, unpublished data). Collectively, this suggests that these genes are downstream of AMS and are up-regulated by AMS, while they are negatively regulated by MS1 to give a specific expression pattern.
A large proportion of these genes are involved in the biosynthesis of sporopollenin, which is a major constituent of exine in the outer pollen wall. Recently in Arabidopsis, a number of genes have been demonstrated to be involved in sporopollenin biosynthesis [...] genes are found in this cluster. Of the studied genes within this cluster, the phenotypes range from an absence of production of pollen due to compromised pollen walls (ACOS5 and TKPR1; de Azevedo Souza et al., 2009; Tang et al., 2009; Grienenberger et al., 2010) to having irregular exine layers (CYP703A2, CYP704B1, ATP-BINDING CASSETTE G26 [ABCG26], LAP5, LAP6, and MS2; Morant et al., 2007; Dobritsa et al., 2009, 2010; Kim et al., 2010; Choi et al., 2011), with varying phenotypes from loss of fertility to reduced pollen viability to no effect on pollen viability. The sporopollenin precursor biosynthesis gene WBC27 (ABCG26) is also within this cluster and is involved in the transport of these sporopollenin precursors from the tapetum, facilitating exine formation on the pollen surface (Choi et al., 2011). Two other genes of unknown function are believed to play a role in pollen exine formation, At3g23770 and At4g14080 (MATERNAL EFFECT EMBRYO ARREST48), and two further unknown genes are tapetum specific, At3g4290 (TAPETUM1) and At4g20420, based on their description in AtEnsembl (Flicek et al., 2013; http://atensembl.arabidopsis.info/index.html). Therefore, 12 out of 20 genes in this cluster have a direct role in the tapetum, with one-half of the genes in this cluster involved in exine/sporopollenin formation. The rest of the genes in this cluster have an unknown function, including two putative lipid transfer proteins, At5g07230 and At5g62080, which also may be involved in sporopollenin biosynthesis. This strongly suggests that the remaining six unknown genes within this cluster could play a role in sporopollenin synthesis in some manner, and studying the knockouts of these genes could further our understanding of the sporopollenin biosynthesis pathway and transport to the pollen coat wall.
[Figure legend fragment: ... (Honys and Twell, 2004; Yang et al., 2007a) used to generate FlowerNet. UNM, unicellular microspores; BCP, bicellular pollen; TCP, tricellular pollen; MPG, mature pollen grain. Young buds represent pollen mother cell mitosis I, while old buds represent mitosis II.]

Cluster 81: Pollen Spermidine Formation
Cluster 81 is a 12-gene clique, with all 12 of these genes found to be stamen specific (Wellmer et al., 2004) and up-regulated in flower stages 9 to 11 (Schmid et al., 2005). These genes are differentially expressed (mostly down-regulated) in both ms1 and ams mutants throughout pollen developmental stages (Alves-Ferreira et al., 2007; Xu et al., 2010); they are also tapetum specific (Z.A. Wilson and G. Vizcay-Barrena, unpublished data) and are expressed particularly at the pollen mitosis I stage (Xu et al., 2010). Two genes (At5g49070 and At3g52160) were also down-regulated in myb80 anthers.
Similar to cluster 37, the majority of these genes have suggested roles in pollen exine formation (five), lipid biosynthesis (three), and lipid transfer (one), with the two studied genes in this cluster being involved in spermidine (exine) formation. Both SPERMIDINE HYDROXYCINNAMOYL TRANSFERASE (SHT) and TAPETUM-SPECIFIC METHYLTRANSFERASE1 (TMS1) are specifically expressed in tapetum cells (Fellenberg et al., 2008; Grienenberger et al., 2009). SHT mutants display irregularities in the pollen wall, with reduced autofluorescence of the pollen wall suggesting a reduction in the exine coat components. They also have a reduction of two spermidine derivatives, while in the overexpressing plants there was an increased peak for these derivatives and increased autofluorescence (Grienenberger et al., 2009). Grienenberger et al. (2009) demonstrated that SHT can catalyze the last methylation step in the biosynthesis of acylated spermidine conjugates. The TMS1 mutant showed impaired silique development (Fellenberg et al., 2008) and was shown to be involved in the biosynthesis of spermidine conjugates at a later step than SHT (Grienenberger et al., 2009).
A number of the unknown genes in this cluster also have potential roles in pollen exine and lipid biosynthesis, based on their description in AtEnsembl (Flicek et al., 2013). Therefore, it would be interesting to study the rest of the genes within this cluster to further the understanding of exine biosynthesis.

Cluster 116: Pollen Tube Formation/Growth
Cluster 116 is a 10-gene clique, with seven of these genes found to be stamen specific (Wellmer et al., 2004), with the genes showing increased expression in pollen and in stamen at stages 12 and 15 (Schmid et al., 2005). These genes are down-regulated in both ms1 (samples 2 and 1) and ams (pollen mitosis II) mutants (Alves-Ferreira et al., 2007; Xu et al., 2010). The genes have highest expression in pollen mitosis II (Xu et al., 2010). One-half of the genes in this cluster are known to be pollen specific, such as At59 (Kulikauskas and McCormick, 1997), α-1 TUBULIN (Carpenter et al., 1992), and Arabidopsis PHOSPHATASE AND TENSIN HOMOLOG1 (Gupta et al., 2002), and have roles such as microtubule/filament movement, cell wall modification, and sugar transport (Schneidereit et al., 2005), so they could play roles in pollen tube growth and formation, as does CALCIUM-DEPENDENT PROTEIN KINASE24 (Zhao et al., 2013).
[Figure legend fragment: ... I) in different development stages in whole buds, based on wild-type data from Alves-Ferreira et al. (2007). Sample 1 contains stage 13 buds, and each subsequent sample comprises the next two oldest buds until sample 7, which includes the earliest buds and the inflorescence meristem.]

DISCUSSION
FlowerNet highlights clusters of genes that are similarly transcriptionally regulated and, therefore, may play closely related functional roles. Focusing on four specific clusters, we show that there is a close relationship between the genes in how they are expressed from a number of different data sets from wild-type tissues or staged samples and also in male-sterile mutant expression data sets, which were not used to make FlowerNet. This provides evidence for the robustness of this method and the network generated. Therefore, FlowerNet could play a valuable role in identifying new players in anther and pollen development, as, while the main transcription factors are known, their exact interaction and primary targets are still uncharacterized. Cluster 37 is a strong indication of how this tool can be used successfully, as 12 out of 20 genes in this cluster have a direct role in the tapetum, with one-half of the genes in this cluster known to be involved in exine/sporopollenin formation. The rest of the genes in this cluster have an unknown function, including two putative lipid transfer proteins, At5g07230 and At5g62080, which also may be involved in sporopollenin biosynthesis. This strongly suggests that the remaining unknown genes within this cluster also could play a role in sporopollenin synthesis; studying the knockouts of these genes could further our understanding of the sporopollenin biosynthesis pathway and transport to the pollen coat wall. In addition to the characterization of genes associated with anther and pollen development, the FlowerNet correlation network is also a valuable tool for the analysis of genes generally involved in flowering and plant reproduction, since the data used to generate the network have come from microarray experiments from various organs and developmental stages throughout floral development.
To aid the global dissection of anther and pollen development, we have made the gene expression plots available as a community resource (http://www.cpib.ac.uk/anther), allowing visualization of expression across different microarray experiments that have been normalized together. Included in this resource is a list of highly correlated genes based on the FlowerNet clusters, which can be used to extract further information from these plots. Additionally, the FlowerNet network is available as a Cytoscape file for specific analysis of the network and resulting clusters. These tools will be extremely helpful to the flower development community, aiding both the identification of novel genes on which to focus and the selection of putatively redundant family members for the generation of mutants carrying multiple mutations. For example, the majority of the genes in cluster 21 have not been studied, despite the fact that many of them have been linked to pollen development. The mutant phenotypes that have been shown in this cluster are only from triple mutants of gene families, indicating the high level of redundancy of the genes within this cluster. This shows how important it is to understand how similar genes from the same family behave transcriptionally in relation to each other in order to further our understanding of pollen development. This clustering tool allows selective focusing on specific targets to develop our understanding of anther and pollen development.

Gene Expression Plots and Microarray Normalization
The gene expression plots consist of three separate panels of data relating to different microarray platforms. The first panel consists of Affymetrix microarrays, where the raw .cel files were background corrected and normalized using the robust microarray averaging procedure (Irizarry et al., 2003), with a custom chip definition file from the CustomCDF project (Ath1121501_At_TAIRG.cdf v14.0.0; Dai et al., 2005), using the Bioconductor affy package in the programming language R. This chip definition file maps the individual probes on the Affymetrix chip to their corresponding genes, using recent sequencing information contained in The Arabidopsis Information Resource (Lamesch et al., 2012), and results in 21,313 unique genes being considered. This mapping eliminates the many-to-many relationship that exists between the Affymetrix probe sets and gene targets as traditionally used. In particular, this bijective mapping ensures that Arabidopsis Genome Initiative gene codes may be used as the primary identifiers in a correlation network, with no question of how to deal with multiple probe sets, sometimes with markedly differing behaviors, that correspond to the same gene. The resulting probe sets have varying numbers of probes with a minimum of three, although the majority of probe sets have the 11 probes from an original Affymetrix probe set.
The second panel includes the two-color microarrays described by Alves-Ferreira et al. (2007), where bud samples were collected from stage 13 through the earliest buds and the inflorescence meristem. In this data set, sample 1 consists of the pooled mature stage 13 flowers from a number of plants, with subsequent samples comprising successively younger buds, until the final sample (sample 7), which consists of the remaining early-stage buds and the inflorescence meristem. These two-color microarrays consist of one channel for wild-type buds and the second channel for mutant buds (ap3, spl/nzz, and ms1). The typical analysis for such two-color microarrays is to find lists of genes that are up- or down-regulated at each sample but not to compare across the different samples. Using the Bioconductor (Gentleman et al., 2004) package limma (Smyth and Speed, 2003; Smyth, 2005), the 33 .gpr files were background corrected (using the normexp method; Ritchie et al., 2007), then the two channels within each array (wild type and mutant) were normalized to have similar distributions (using the median method), and finally the data were quantile normalized across the 33 arrays. This last step of quantile normalization is nonstandard for two-color arrays, although it is discussed by Yang and Thorne (2003), but it is a routine part of single-color microarray normalization. It allows the comparison of fluorescence intensity between the different arrays and enables wild-type and mutant values to be compared as a biologically meaningful developmental time course.
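The quantile normalization step described above can be illustrated with a minimal Python sketch. It is not the limma code used for the published plots: the function name quantile_normalize and the toy intensities are hypothetical, and ties are broken by position here, whereas limma's own routine treats ties and missing values more carefully.

```python
def quantile_normalize(columns):
    """Quantile normalize equal-length columns of intensities.

    Each column stands for one array (or channel). After normalization
    every column holds the same set of values: the mean, across
    columns, of the values at each rank.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all columns must have the same length")
    # Sort each column, then average across columns at each rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns)
        for i in range(n_rows)
    ]
    normalized = []
    for col in columns:
        # Indices of this column ordered from smallest to largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for two arrays (hypothetical values, not from the paper).
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
```

After this step the two toy arrays share an identical distribution of values, which is what makes intensities comparable between arrays.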
The third panel includes data from an in-house-printed two-color microarray generated by Xu et al. (2010), which was used to compare wild-type and ams buds at the pollen mother cell meiosis, pollen mitosis I, bicellular pollen, and pollen mitosis II stages. These 12 .gpr files were normalized in the same way as the data from Alves-Ferreira et al. (2007), allowing the comparison of genes across this time course.
Each of these three microarray platforms contains probe sets corresponding to a different set of genes, so some genes are not present in all three panels, such as the key gene MS1, which is not present on the Affymetrix chip.

Correlation Network Generation
Sixty-six wild-type Arabidopsis (Arabidopsis thaliana) Affymetrix .cel files from both public (Honys and Twell, 2004; Schmid et al., 2005; Mandaokar et al., 2006; Yang et al., 2007) and unpublished experiments on functionally wild-type buds and anthers were used to generate a correlation network, which we term FlowerNet. These chips include 10 pollen samples, 28 whole buds, and 28 anther/stamen samples; for full details, see Supplemental File S1. A number of chips from other published experiments could not be included, as the raw data were not available at the time of construction.
To generate the network, the 66 raw .cel files were background corrected and normalized using robust microarray averaging and the CustomCDF as discussed above. The genes were then filtered by keeping those probe sets that have at least one sample with expression greater than or equal to 6 (on the log2 scale). To identify interactions between the expressed genes, the Pearson correlation coefficient between all pairs of genes was calculated. A cutoff was then applied to the resulting correlation matrix to produce a set of edges. Previous studies (Bassel et al., 2011; Dekkers et al., 2013) show that approximately one-half million edges is a good number from which to generate meaningful clusters. To this end, we chose a correlation of 0.88 as the cutoff, giving 605,686 edges between 10,797 genes, which correspond to the 0.38% most significant positive correlations (Supplemental File S5); this cutoff was chosen semiarbitrarily to balance sensitivity and specificity. Each correlation above this cutoff is considered to be an edge, and a table of edges between nodes was exported into Cytoscape version 2.8.1 (Shannon et al., 2003; Smoot et al., 2011), along with the correlation value for each edge. The yGraph Organic layout was used to display the resulting networks, although such layouts should be viewed as arbitrary.
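The filtering, correlation, and thresholding steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used to build FlowerNet: the function names pearson and build_edges, the gene labels, and the expression values are all hypothetical. A full-scale analysis over thousands of genes would normally use a vectorized matrix library instead of an explicit loop over pairs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        return 0.0  # a constant profile carries no correlation information
    return sum(a * b for a, b in zip(dx, dy)) / denom


def build_edges(expression, min_expression=6.0, cutoff=0.88):
    """Return (gene_a, gene_b, r) for every gene pair with r above cutoff.

    expression maps a gene identifier to its log2 expression values,
    one value per array, in the same array order for every gene.
    """
    # Keep genes with at least one sample at or above the threshold.
    kept = {g: v for g, v in expression.items() if max(v) >= min_expression}
    edges = []
    for gene_a, gene_b in combinations(sorted(kept), 2):
        r = pearson(kept[gene_a], kept[gene_b])
        if r > cutoff:  # positive correlations above the cutoff only
            edges.append((gene_a, gene_b, r))
    return edges


# Toy data: hypothetical gene labels and values, not taken from the paper.
toy = {
    "geneA": [5.0, 6.5, 8.0, 9.5],
    "geneB": [4.0, 5.4, 7.1, 8.4],  # rises with geneA
    "geneC": [9.0, 7.0, 5.5, 4.0],  # falls as geneA rises
    "geneD": [3.0, 3.2, 3.1, 2.9],  # never reaches 6, so it is filtered out
}
edges = build_edges(toy)
```

In this toy example only the geneA/geneB pair survives: geneD is removed by the expression filter, and geneC is negatively correlated with the other two, so it contributes no edge. The resulting list of triples has the same shape as an edge table with a correlation value per edge.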
From these correlation networks, the Cytoscape plug-in ClusterMaker version 1.11 (Morris et al., 2011) was used to partition the overall network into distinct clusters. In particular, the transitivity clustering method (Wittkop et al., 2010) was used, with parameters Max Subcluster Size = 400 and Max Time = 10, using the correlation values as edge weights to generate small, well-connected clusters in the network (Supplemental File S3). To verify that these clusters are relatively insensitive to the choice of cutoff, we generated an additional set of clusters with a cutoff of 0.9, which keeps 324,977 edges. Comparing these with the FlowerNet clusters, for the 92 new clusters with 10 or more genes, on average 65.2% of the genes in a cluster are found in a single FlowerNet cluster.
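One plausible reading of the cluster comparison above is sketched below in Python. The text does not spell out exactly how the average was computed, so this is an assumption: for each sufficiently large new cluster, take the largest fraction of its genes that fall in one reference cluster, then average those fractions. The function name mean_best_overlap and the toy gene labels are hypothetical.

```python
from collections import Counter


def mean_best_overlap(new_clusters, reference_membership, min_size=10):
    """Average best-overlap fraction over sufficiently large new clusters.

    new_clusters: iterable of collections of gene identifiers.
    reference_membership: dict mapping gene -> reference cluster label.
    Genes absent from the reference count toward the cluster size but
    cannot contribute to any overlap (an assumption of this sketch).
    """
    fractions = []
    for genes in new_clusters:
        genes = list(genes)
        if len(genes) < min_size:
            continue  # small clusters are ignored
        counts = Counter(
            reference_membership[g] for g in genes if g in reference_membership
        )
        best = max(counts.values(), default=0)
        fractions.append(best / len(genes))
    if not fractions:
        return 0.0
    return sum(fractions) / len(fractions)


# Toy example with a reduced size threshold (hypothetical labels).
reference = {
    "g1": 1, "g2": 1, "g3": 1, "g4": 2,
    "g5": 3, "g6": 3, "g7": 4, "g8": 4,
}
new = [["g1", "g2", "g3", "g4"], ["g5", "g6", "g7", "g8"], ["g9"]]
score = mean_best_overlap(new, reference, min_size=4)
```

Here the first new cluster has three of its four genes in one reference cluster (0.75), the second has two of four (0.5), and the single-gene cluster is skipped, giving a mean of 0.625.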

Figure 1. Plot from the Web site tool showing the expression of AMS in various anther and floral microarrays (http://www.cpib.ac.uk/anther).

Figure 2. Cytoscape visualization of the full correlation network. Genes shown by Wellmer et al. (2004) to have higher expression in stamen are colored in purple.

Figure 3. Plot of the genes in cluster 110, showing expression in the 66 microarrays that are used to generate FlowerNet. The order of the microarrays is listed in Supplemental File S1.

Figure 4. Selection of FlowerNet clusters, chosen for being bud related based on a number of criteria such as known anther-related genes and male-sterile mutant expression. Highlighted in pink are those genes found to be stamen specific (Wellmer et al., 2004).

Figure 5. Gene expression levels in the four selected clusters in different pollen development stages in whole buds, based on wild-type data from Xu et al. (2010) for each gene within the cluster. The thick black line shows the average trend of all the genes in the cluster.

Figure 8. Gene heat map showing the levels of expression in the four selected clusters (Table I) in different development stages in whole buds, based on wild-type data from Alves-Ferreira et al. (2007). Sample 1 contains stage 13 buds, and each subsequent sample comprises the next two oldest buds until sample 7, which includes the earliest buds and the inflorescence meristem.

Table I. Genes present in the four selected clusters, including their predicted roles and published expression patterns
