Enlarging cells initiating apomixis in Hieracium praealtum transition to an embryo sac program prior to entering mitosis.

Hieracium praealtum forms seeds asexually by apomixis. During ovule development, sexual reproduction initiates with megaspore mother cell entry into meiosis and formation of a tetrad of haploid megaspores. The sexual pathway ceases when a diploid aposporous initial (AI) cell differentiates, enlarges, and undergoes mitosis, forming an aposporous embryo sac that displaces sexual structures. Embryo and endosperm development in aposporous embryo sacs is fertilization independent. Transcriptional data relating to apomixis initiation in Hieracium spp. ovules is scarce and the functional identity of the AI cell relative to other ovule cell types is unclear. Enlarging AI cells with undivided nuclei, early aposporous embryo sacs containing two to four nuclei, and random groups of sporophytic ovule cells not undergoing these events were collected by laser capture microdissection. Isolated amplified messenger RNA samples were sequenced using the 454 pyrosequencing platform and comparatively analyzed to establish indicative roles of the captured cell types. Transcriptome and protein motif analyses showed that approximately one-half of the assembled contigs identified homologous sequences in Arabidopsis (Arabidopsis thaliana), of which the vast majority were expressed during early Arabidopsis ovule development. The sporophytic ovule cells were enriched in signaling functions. Gene expression indicative of meiosis was notably absent in enlarging AI cells, consistent with subsequent aposporous embryo sac formation without meiosis. The AI cell transcriptome was most similar to the early aposporous embryo sac transcriptome when comparing known functional annotations and both shared expressed genes involved in gametophyte development, suggesting that the enlarging AI cell is already transitioning to an embryo sac program prior to mitotic division.


INTRODUCTION
Some Hieracium subgenus Pilosella species form seed via sexual reproduction.
Others are facultative for apomixis where the majority of seed is formed via an asexual pathway and therefore genetically identical, while a small proportion of seed is derived via sexual reproduction. Female gametophyte (or embryo sac) development in ovules of sexual Hieracium species occurs via the most common pathway observed in angiosperms (Drews and Koltunow, 2011). It initiates with megasporogenesis, a process requiring diploid megaspore mother cell (MMC) differentiation and subsequent MMC meiosis to produce a tetrad of four haploid megaspores. Three of these megaspores undergo cell death. The surviving or functional megaspore (FM) undergoes megagametogenesis, characterized by three rounds of syncytial nuclear mitosis, followed by cellularization and differentiation to produce the mature female gametophyte. Six cells in the female gametophyte contain a haploid nucleus including the egg cell, two synergids and three antipodal cells while the central cell contains two haploid nuclei that fuse prior to double fertilization. Fertilization of the haploid egg and the diploid central cell in the female gametophyte by haploid male sperm cells triggers formation of the embryo and endosperm compartments of the seed respectively (Figure 1A, yellow;Koltunow et al., 1998).
In apomictic Hieracium subgenus Pilosella species, the MMC initiates and completes meiosis as observed in sexual species. The meiotic events of megasporogenesis are essential for apomixis initiation in H. piloselloides and are thought to activate function expressed in the MMC of sexual and apomictic Hieracium but not in AI cells (Tucker et al., 2003). The DMC1 gene, which is required for interhomolog recombination during meiosis (Couteau et al., 1999) is expressed in the MMCs of sexual and apomictic Hieracium but is undetectable in AI cells via in situ hybridization (Okada et al., 2007). While these data imply the AI cell is unlikely to have MMC identity, and more likely to differentiate with functional megaspore identity, the expression of other meiosis genes has not been examined. Thus the possibility that the meiotic pathway may initiate and deviate at other stages in developing AI cells cannot currently be excluded. The possibility that Hieracium AI cells are functional megaspores is also unresolved due to the limited availability of megaspore-specific markers to test this. The Arabidopsis functional megaspore marker pFM1 (Acosta-Garcia and Vielle-Calzada, 2004) has been introduced to sexual and apomictic Hieracium but is not expressed in ovules (Koltunow et al., 2011a;.
Collectively, these analyses highlight the current paucity of cell-type specific markers and ovule EST sequence information pertaining to early apomictic development in aposporous Hieracium.
In this study, we used laser capture microdissection (LCM) and 454 pyrosequencing of the isolated and amplified RNA to examine gene expression in enlarging AI cells and early aposporous embryo (EAE) sacs, in comparison with surrounding ovule cells, during apomictic initiation in H. praealtum. Indicative roles of each cell type were established through comparisons of expressed sequences across all three cell types, and analyses of sequence annotations derived through homology to known genes and protein motifs.
These analyses have revealed close functional similarity between AI cells and EAE sacs, and significant enrichment of signaling functions in surrounding sporophytic ovule cells, which may impact on apomixis initiation and development in aposporous H. praealtum.
[...] these, showing homology to an abscisic acid induced dehydration responsive RD22-like gene (clone 09.45 RD22, Table S2), a CC-NBS-LRR-like disease resistance gene (clone 24.04 NLR, Table S2) and a putative lipoxygenase-like gene involved in jasmonic acid synthesis (clone 27.18 LOX, Table S2) were observed to be up-regulated in the AI cell by RT-PCR (Figure 1C, class II; Table S2). When quantitative real-time PCR was used to examine expression of the three AI cell expressed genes and three others in the aRNA samples, expression patterns were consistent with the cell-type enrichment patterns observed in RT-PCR, with low level expression of the three AI cell enriched genes detected in SO cell and EAE sac samples (Figure S2).
[...] other two genes are shown in Figure S3. Transcripts were difficult to detect because of low expression levels. However, transcripts were found in enlarged, uninucleate AI cells, in degenerating megaspores, degenerating nucellar epidermal cells and EAE sacs. By contrast, transcripts from these three genes were not detectable by in situ hybridization in sexual H. pilosella ovules undergoing the events of megasporogenesis. In situ results for two of these transcripts are shown in Figure S3. This suggests that these three genes are up-regulated in a small subset of ovule cell types undergoing apomixis initiation and sexual suppression in the apomict.
Since the aRNA generated from the three H. praealtum laser-captured ovule cell types retained a majority of the tested low copy ovule genes, the samples were further processed for 454 pyrosequencing to compare expression profiles in each cell type and to explore the functional identity of the AI cell. The identified set of low level Hieracium ovule sequences, with known expression patterns in the three aRNA cell types, served as useful internal controls to gauge the efficacy of transcript sequencing depth and assembly.

Hieracium subgenus Pilosella species currently lack a reference genome or substantial DNA or EST public sequence resources, and therefore expression profiling requires the use of de novo transcriptome assembly and characterization approaches.
Relative to other high-throughput sequencing technologies, such as Illumina sequencing, 454 pyrosequencing technology generates sequence reads on average 2.5 times longer, which better facilitates de novo assembly; however, the total read count is lower. In total, 465,191 high quality sequence reads with a median read length of 251 bases were obtained from the AI, SO cells and EAE sac samples (Table I). A de novo transcriptome characterization strategy, encompassing three complementary in silico approaches (Figure S4), was used to make qualitative comparisons between the cell-type transcriptomes and to infer distinctive functional features of the three H. praealtum ovule cell types. The first in silico approach explored the sequence overlap between the three cell-type transcriptomes irrespective of similarity to known genes or functional annotations. This approach required the assembly of sequence reads into cell-type contig sets as described below. The second and third approaches mapped expressed sequences or contigs to known protein domain and gene annotations, and contrasted cell-type transcriptomes in pairwise comparisons of these annotations.
To compare the expressed sequence complement of three cell types, four high quality sequence datasets were independently assembled using the MIRA algorithm (Chevreux et al., 2004; Figure S4), one for each cell type in addition to a combined set of all three cell-type datasets (Table II). A total of 8,044 sequence contigs were assembled for captured SO cells, 8,780 for AI cells, 5,002 for EAE sacs and 18,219 for the combined assembly with median lengths of 403 to 474 bases across all four assemblies.
The combined assembly yielded 16% fewer distinct contigs than the sum of the three cell-type contig sets, with little gain in median contig length. This result may suggest substantial sequence diversity in these polyploid transcriptomes, but may also indicate insufficient continuity of coverage across transcript lengths to achieve longer transcript assembly. The set of non-redundant contigs arising from the combined assembly was used as a consensus or point of reference for sequence comparison between the three cell-type contig sets (Figure 2A).
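The 16% figure can be checked directly from the contig counts reported above. The short sketch below recomputes it; the variable and function names are illustrative only and are not taken from the study's own pipeline.

```python
# Contig counts reported in the text for each assembly.
cell_type_contigs = {"SO": 8044, "AI": 8780, "EAE": 5002}
combined_contigs = 18219


def percent_reduction(separate_total: int, combined: int) -> float:
    """Percentage by which `combined` is smaller than `separate_total`."""
    return 100.0 * (separate_total - combined) / separate_total


# Sum of the three independently assembled cell-type contig sets.
separate_total = sum(cell_type_contigs.values())
reduction = percent_reduction(separate_total, combined_contigs)

print(f"sum of cell-type contig sets: {separate_total}")
print(f"reduction in combined assembly: {reduction:.1f}%")
```

The reduction reflects contigs shared between cell types that collapse into a single contig when all reads are assembled together.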

The utility and relevance of the assembled contig sets were assessed by comparison to characterized sequences expressed in the H. praealtum ovary, and through annotation by sequence homology to public sequence databases. Of the low copy ovule sequences detected by RT-PCR in RNA amplified from each cell type (Figure 1C; Table S2), between 75 and 79% of sequences were identified in the assembled cell-type contigs (Table II). The three genes shown by in situ to be enriched in AI cells, degenerating megaspores and nucellar epidermal cells (Figure 1D-G; Figure S3) were not detected in the assembled AI cell transcriptome. This may relate to the depth of the 454 sequence data set such that these transcripts are below the limit of detection. In addition, the enrichment of 3' end sequences in our transcriptome due to the RNA amplification process, as found by others (Wuest et al., 2010), may hamper the identification of these sequences in the sequence read data if the characterized transcript sequences are not full length.
The number of unique contig sequences observed for the EAE sac sample was approximately 30% less than both AI cell and SO cell samples (Table II). Despite this, the ovary-enriched, low-abundance transcripts were observed in similar proportions across all cell types including the EAE sac contig set (Table II). These observations suggest that the three LCM transcriptome datasets have comparable sequence coverage. However, until a comprehensive de novo characterization of these cell-specific transcriptomes is made possible with deeper sequence coverage, a difference in transcriptome diversity and coverage cannot be excluded. Lack of sequence depth in this dataset enforces a technical boundary on the transcripts that can be observed. However, the detection of 75% of the tested low abundance ovule transcripts in the aRNA sample, and the presence of the majority of these in the assembled contig sequences suggest that the dataset has a range of detection that can be used to make qualitative comparisons between these previously unstudied cell-type transcriptomes.
The majority of contigs generated high-quality matches through blast sequence alignments to the NCBI non-redundant proteins (nr), Arabidopsis TAIR10 peptide and [...] generated for 30-50% sequence contigs across the cell types, and of the TAIR10 alignments more than 96% of annotated contigs (36% of all contigs) could be mapped to a GO term (Table III). The unannotated remainder may contain novel transcripts unique to Hieracium, incorrectly assembled contigs, or contigs lacking sufficiently long stretches of coding sequence to derive high-scoring cross-genome alignments.
Although the aim of this study was qualitative comparison and de novo characterization and not quantitative profiling of transcripts, sequence read number per contig was compared to estimates of transcript abundance generated by Q-PCR for 15 randomly selected contig sequences, 5 contigs from each of the three cell-type assemblies.
For these tested candidates, the average correlation between read counts and Q-PCR for each cell type was high (R ≥ 0.9; Table II; Figure S5).
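As an illustration of how read counts per contig can be compared with Q-PCR estimates, the sketch below computes a correlation coefficient for one cell type. It assumes a Pearson coefficient, which the text does not specify, and the five value pairs are hypothetical placeholders rather than data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for five contigs from one cell type (NOT study data):
# number of 454 reads per contig and relative abundance from Q-PCR.
read_counts = [4, 9, 15, 22, 40]
qpcr_relative = [0.8, 2.1, 3.0, 4.9, 8.2]

r = pearson_r(read_counts, qpcr_relative)
print(f"R = {r:.3f}")
```

With only five contigs per cell type, a coefficient of this kind is best read as a coarse consistency check, in line with the qualitative aim stated above.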
Collectively, these results suggest that the assembled contig sets generated from 454 sequencing of Hieracium LCM-derived samples contain cell-type specific sequences corresponding to known transcripts from other databases, in addition to unknown and unannotated sequences.

Sequence similarity and overlap in pairwise comparisons of cell-type contigs
The first in silico approach explored sequence similarity between the three assembled cell-type transcriptomes using the total combined contig set (Table II) as a point of reference. Sequence similarity was identified through blastn analysis (E-value [...] AI and SO cell overlap compared with 11.8% (803) that were specific to the AI and EAE sac overlap. SO cells also shared more contigs uniquely with AI cells (24.4%) than with EAE sacs (8.4%). From the perspective of the EAE sac transcriptome, there were slightly more contigs unique to the overlap between EAE sacs and AI cells (803) than there were unique to the overlap between EAE sacs and SO cells (527; Figure 2A). Thus, in terms of similarity of expressed sequences, the EAE sac transcriptome bore greatest similarity to the AI cell, while the AI and SO cell transcriptomes shared greater sequence overlap than either of their pairwise sequence comparisons to EAE sacs.
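The overlap counts of the kind summarized in Figure 2A can be derived by recording, for every reference contig in the combined assembly, which cell-type contig sets matched it. The sketch below shows one way to tabulate such regions; the contig identifiers and the `venn_counts` helper are hypothetical, and the matching step itself (blastn against the combined set) is assumed to have been done beforehand.

```python
from collections import Counter


def venn_counts(hits_by_cell_type):
    """Count reference contigs in each exclusive overlap region.

    `hits_by_cell_type` maps a cell-type label to the set of reference
    contig IDs matched by that cell type's contigs. The result maps each
    combination of cell types (a frozenset) to the number of reference
    contigs matched by exactly that combination.
    """
    membership = {}
    for cell_type, ref_ids in hits_by_cell_type.items():
        for ref_id in ref_ids:
            membership.setdefault(ref_id, set()).add(cell_type)
    return Counter(frozenset(types) for types in membership.values())


# Hypothetical reference contig IDs, for illustration only.
hits = {
    "AI": {"c1", "c2", "c3", "c5"},
    "SO": {"c2", "c3", "c4"},
    "EAE": {"c3", "c5", "c6"},
}

counts = venn_counts(hits)
for region, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print("+".join(sorted(region)), n)
```

Counting exclusive regions, rather than simple pairwise intersections, is what allows statements such as "specific to the AI and EAE sac overlap" to exclude contigs shared by all three cell types.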

Protein domain annotations of unassembled sequences show cell type specific enrichment in signaling and protein metabolism
The second in silico approach investigated each cell type for specific protein domain signatures that may reflect distinctive functional attributes. We analyzed the set of high quality unassembled sequences from each cell type using Pfam, the protein domain sequence database resource (Punta et al., 2012). Reads from the SO cells, AI cell and EAE sac samples could be mapped to 1,570, 1,552 and 981 Pfam domains respectively (Table I).
Significantly enriched Pfam domains for each cell type were identified in pairwise contrasts between the three cell types (Figure S4) and annotated for gene ontology (GO) categories. Table S3 shows the set of significantly enriched Pfam domain annotations in each cell type with associated GO terms.
In the context of enriched Pfam domains with GO annotations, the greatest distinctions could be found in comparisons between EAE sacs and SO cells with a total of 23 Pfam domains showing differential frequencies between these two cell types. Of the 15 domains that show enrichment in EAE sacs, nine represent either small or large subunits of the ribosomal complex (Table S3), and as such, all are associated with the GO molecular function term of structural constituent of ribosome (GO:0003735). Most other domains found enriched in the EAE sac data relative to SO could be grouped under the gene ontology parent terms of hydrolase activity (GO:0016787) or transition metal ion binding (GO:0046914), with the latter enriched in cytochrome P450 (iron binding) and plastocyanin-like (copper ion binding) domains. Each of these domains has been identified as an important catalytic component in a broad range of physiological, developmental and signaling pathways in plants.

Comparison of the AI cell with SO cells showed enrichment of ribosomal protein Pfam domains in the AI cell, similar to that found in the EAE sacs relative to SO cells. However, the AI cell was also enriched for Pfam domains implicated in ubiquitin-dependent protein degradation. The ubiquitin proteasome protein catabolic complex is involved in diverse developmental processes including regulation of auxin (reviewed in Vierstra, 2012) and jasmonate signaling (Xie et al., 1998; Table S3).
The AI cell and EAE sac Pfam domain comparisons showed the fewest differences, implying a greater functional similarity. The majority of domains observed to be enriched in the SO cell sequence set relative to EAE sacs were similarly enriched in SO cells relative to the AI cell, also supporting closer functional association between AI cells and EAE sacs relative to SO cells. The only domain showing statistically significant enrichment in AI cells relative to EAE sacs was the WD40 protein domain. This domain has been implicated in female gametophyte development (Shi et al., 2005) and is known to function in creating protein scaffolds and facilitating protein interactions in multi-protein complexes such as the E3 ligase complex (Smith et al., 1999). Pectinesterase, cysteine protease and the profilin domain were enriched in EAE sacs relative to the AI cell sequence set, which may relate to events of embryo sac growth and expansion. The profilin domain is implicated in actin binding, commensurate with the abundance of actin cytoskeletons in two- and four-nucleate embryo sacs during megagametogenesis (Webb and Gunning, 1994).
Collectively, analysis of protein domains that could be annotated in unassembled reads from the three captured cell types suggested that the SO cell transcriptome was functionally distinct from those of the AI cell and EAE sac, predominantly in signaling and ribosome and ubiquitin related domains, while highlighting substantial functional overlap between the AI cell and dividing EAE sacs.

Annotation of assembled contigs shows closest functional identity exists between AI cells and EAE sacs
In a third in silico approach, we associated contigs to GO annotations through sequence homology and sought to identify GO terms that showed specific cell-type enrichment using pairwise comparisons. All contig sets showed comparable GO annotation rates ranging from 65% to 74% (Table III). GO enrichment analysis was completed in two complementary stages; the first was a commonly used singular GO term enrichment analysis (SEA) utilizing a Fisher exact test for increased frequencies of individual terms in an input list relative to a background list (Ashburner et al., 2000; Du et al., 2010). While a commonly used and powerful approach, SEA GO analyses consider individual GO term frequencies in isolation against the full GO hierarchy, and therefore can struggle to identify statistically significant evidence of enrichment for more specific, but less frequently observed, child terms. To address this, nested GO analysis (nEASE) was also used (Chittenden et al., 2012). This approach restricts testing to within significant terms found in the first stage (SEA) and uses related GO terms and similarly annotated genes to better discriminate enrichments at more specific GO terms. These approaches have been shown to provide greater sensitivity to detect biologically relevant functional themes in human cancer expression profiling (Zhang et al., 2010; Chittenden et al., 2012). The full results of the GO enrichment analysis from both singular and nested analyses are presented in Table S4.
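The first-stage test can be illustrated with a one-sided Fisher exact test on a 2x2 table of term counts in an input list and a background list. The sketch below is a minimal standard-library version under that assumption; it is not the SEA or nEASE software used in the study, it covers only the first stage, and the counts in the usage example are hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P value for enrichment in the input list.

    2x2 table:
        a = input contigs annotated with the GO term
        b = input contigs without the term
        c = background contigs annotated with the term
        d = background contigs without the term

    Returns the hypergeometric probability of observing `a` or more
    annotated contigs in the input list, given fixed table margins.
    """
    n_input = a + b
    n_term = a + c
    n_total = a + b + c + d
    denominator = comb(n_total, n_input)
    upper = min(n_input, n_term)
    numerator = sum(
        comb(n_term, i) * comb(n_total - n_term, n_input - i)
        for i in range(a, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts for one GO term in a pairwise contrast (NOT study data):
# 30 of 1,000 annotated contigs in one cell type versus 10 of 1,000 in the other.
p_value = fisher_exact_greater(30, 970, 10, 990)
print(f"P = {p_value:.2e}")
```

Because every GO term is tested separately, a real analysis would also need a multiple-testing adjustment; the text does not state which one was applied, so none is shown here.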
Interestingly, there were no statistically significant (P≤0.05) differences in GO term counts identified in the reciprocal pairwise comparison between AI cell and EAE sac derived annotations. This implied close functional identity between these two cell types (Table S4), but does not necessarily mean that the same genes are expressed in both samples, and clearly all cell-type contig sets contain subsets of unique expressed sequences (Figure 2A). The subsets of contig sequences unique to AI and EAE sacs were further explored, and annotation rates and GO results were concordant with comparisons of the full cell-type contig sets. Therefore these unique sequences may represent different sequence mappings to similar gene sets, poorly annotated sequences or species-specific sequences.
The pairwise comparison of EAE sac and SO cell contig annotations yielded the greatest number of discriminatory GO terms (Table S4). These findings were congruent with the prior Pfam annotation analysis of unassembled sequence reads. The enriched GO annotations of EAE sac contigs relative to SO cells are dominated by terms related to gametophyte development, lipid localization, ribosome biogenesis, translation and gene expression, overlapping with many of the annotated Pfam domains. The functional terms enriched in AI cells relative to SO cells also shared substantial overlap with those enriched in EAE sacs relative to SO cells; however, the biological process of flower development was uniquely enriched in AI cells relative to SO cells (Table S4).
GO enrichment terms in SO cells relative to EAE sacs centered around functional themes of signaling, protein kinase activity and phosphotransferase activity, transcription factors and nucleic acid metabolism (Table S4). Similar functional themes were enriched in SO cells relative to the AI cell annotations along with increased annotations related to glycosyl hydrolysis and methyltransferase.
In summary, the pairwise comparisons between assembled contig sequences from each cell type, made on the basis of sequence similarity alone, indicated that while the EAE sac-expressed contigs shared greater similarity to the AI cell than SO-expressed contigs, the AI cell shared best overlap with SO cell-expressed contigs. In complementary annotation-based analyses that identify functional similarity (derived GO annotations), very few discriminatory GO terms from either gene or Pfam domain annotations were identified between AI cell and EAE sac. Derived GO annotations from contigs and Pfam domain annotations of unassembled reads both suggest relative enrichment of signaling pathways in SO cells, and overlapping enrichment of ribosome biogenesis, translation and gametophyte development in the AI and EAE sac transcriptomes. The Pfam analysis more clearly [...]
Genes that were differentially expressed between tissue types in Arabidopsis and could be mapped to a Hieracium contig were also examined (Table III; Figure S7).
Genes meeting these criteria were more likely to be differentially expressed between the FG2-4 and whole ovule sample, compared to the nucellus and its corresponding whole ovule sample (Table III; Figure S7). Many of these genes encoded ribosomal subunits.
Analyses of the expression of ribosomal genes in Arabidopsis ovules showed that most ribosomal genes are decreased in abundance in the Arabidopsis nucellus relative to other tissues, but ribosomes become more abundant in the female gametophyte at FG2-4, during early mitotic events ( Figure 2B). Thus, the enrichment of ribosomal genes in cell types undergoing female gametophyte development in sexual Arabidopsis parallels the enrichment of ribosomal genes in AI cells and EAE sacs in Hieracium relative to other ovule cell types during early aposporous embryo sac development.
We next queried the expression of ubiquitin-associated genes in the Arabidopsis nucellus relative to other Arabidopsis ovule tissues. Ubiquitin-associated genes are slightly more enriched in the nucellus compared with other ovule tissues (Figure 2B). Reciprocal best blast analyses between the 329 probes associated with ubiquitin processes on the Arabidopsis array and Hieracium AI cell, SO cell and EAE sac contigs show that 16 are found only in the AI cell relative to eight in the SO cell and four in EAE sacs. Genes in the AI cell-only category included RUB1 and SKP1-like genes and some ubiquitin protein ligases, which is consistent with the enriched ubiquitin-associated domains found in the Pfam analyses (Table S3). However, in apomictic Hieracium, ubiquitin-associated sequences are enriched in both AI cells and EAE sacs (Table S3; Figure 3).
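A reciprocal best blast analysis keeps only those pairs in which each sequence is the other's top-scoring hit. The sketch below shows that logic for hits already reduced to (query, subject, bit score) tuples; the helper names and identifiers are hypothetical, and ranking by bit score is an assumption, since the text does not state the ranking criterion.

```python
def best_hits(rows):
    """Return {query: best subject}, choosing the highest bit score per query."""
    best = {}
    for query, subject, bitscore in rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_rows, reverse_rows):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(forward_rows)
    reverse = best_hits(reverse_rows)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits, for illustration only (NOT study data).
probe_vs_contig = [
    ("probe_1", "contig_A", 250.0),
    ("probe_1", "contig_B", 90.0),
    ("probe_2", "contig_B", 120.0),
]
contig_vs_probe = [
    ("contig_A", "probe_1", 248.0),
    ("contig_B", "probe_1", 95.0),
    ("contig_C", "probe_2", 60.0),
]

pairs = reciprocal_best_hits(probe_vs_contig, contig_vs_probe)
print(pairs)
```

Here `probe_2` is dropped because its best contig, `contig_B`, has a different probe as its own best hit, which is the situation the reciprocal requirement is designed to filter out.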
Taken together, these comparisons of gene expression in ovules of sexual Arabidopsis and apomictic Hieracium at comparable stages of gametophyte development show high similarity in sequence identity of expressed genes. However, their differential behavior in the tissue sets is not absolutely conserved. This may be due to a combination of the evolutionary distance between the two species, ploidy of the embryo sac structures (i.e., the diploid apomictic Hieracium EAE sac versus the haploid meiotically derived Arabidopsis FG), differences in the developmental stage of the tissues collected, and possibly heterochronic ovule gene expression relating to aposporous embryo sac growth.

Meiosis and megaspore gene homologs are not found in enlarging AI cells
We utilized the significant annotation overlaps between Arabidopsis and Hieracium to directly query datasets for putative homologs in each captured sample to further investigate indicative AI cell functions. First we queried whether the AI transcriptome bears similarity to that of the megaspore mother cell (MMC), or functional megaspore (FM; Figure 1A). Annotation of the AI cell and EAE sac contigs did not yield any GO annotations related to meiosis function. We directly queried the contig set and the unassembled read set for currently known genes characteristic of, and/or required for [...] were not observed in the AI cell transcriptome (see Table S5 for Arabidopsis gene identifiers). As sequence coverage in this study is not saturating, we investigated the possibility that these meiosis-associated genes are expressed in the ovule at levels below the range of detection of expression of this dataset [...] it would be expected that sequencing depth is sufficient to detect at least the 7 conserved candidates with robust transcript abundance in the Arabidopsis ovule. The absence of any of the known transcripts required for meiosis and functional megaspore function in the AI cell suggests it is unlikely to be undergoing either a meiotic or functional megaspore developmental program.

Unique developmental regulators are expressed in developing AI cells as revealed by GO analysis
In order to interrogate the possible molecular functions of the AI cell further we examined more closely the Arabidopsis genes identified by sequence homology found to be differentially present in the three cell types. To guide this investigation we focused on genes associated with the statistically significant GO terms uncovered by the nested GO analysis in pairwise comparisons involving the AI cell (Table S4). This analysis had identified six GO terms in the AI and SO cell pairwise comparison, and none in the AI and EAE sac comparison (Table S4). SO enriched terms included carbohydrate metabolic process and methyltransferase activity, while AI enriched terms included flower development, gametophyte development, multi-organism process and lipid localization.
Notably, lipid localization, multi-organism process and gametophyte development were also enriched in EAE sacs relative to SO cells by nested GO analysis (Table S4). The list of genes underlying these terms was filtered to remove the genes identified in both AI and SO cell transcriptomes, leaving the annotations specific to the AI, EAE sacs or SO cell type.
These annotations were also queried for evidence of expression in Arabidopsis early ovule tissues as assessed by the 44k array (Table S6, [...] (Table S6; Pagnussat et al., 2005). Rho GTPase-like genes were found; these act as molecular switches that control cytoskeletal dynamics and influence pollen tube tip growth and animal cell movement in spatial cell zones. Their expression may suggest potential involvement in the directional growth of AI cells toward sexual cells during sexual suppression (Kenneth and Duckett, 2012; Table S6). [...]; Table S6) was detected in AI cells and two- to four-nucleate EAE sacs. Transcripts of this gene are not detected during mitosis in Arabidopsis embryo sacs. Ectopic expression of EOSTRE results in abnormal nuclear migrations and one of the synergids is converted to a second egg cell in 10-15% of cases (Pagnussat et al., 2007). Hieracium aposporous embryo sacs also exhibit abnormal nuclear migration and conversion of synergids to eggs, leading to corresponding frequencies of polyembryonic seed formation (Koltunow et al., 1998; Koltunow et al., 2000). This may reflect mis-expression of this gene during apospory; however, expression comparisons with sexual Hieracium embryo sacs are needed to confirm this.

Both AI cells and EAE
In general, this expanded in silico analysis of genes underlying terms found through nested GO enrichment highlights the similarity in embryo sac-like programs between AI cells, EAE sacs and Arabidopsis embryo sacs (Table S6), and reveals potential ectopic gene expression and/or possible gene recruitments that may influence features of AI cell gene expression and fate.

Stress and disease resistance-like gene expression in AI cells and EAE sacs
AI cells and EAE sacs also appear to exhibit expression of homologs of genes not evident in Arabidopsis embryo sacs at two- to four-nucleate stages from array analyses (Table S6). These include CDC2-like, ABA stress-inducible genes thought to function in autophagy, a CONSTITUTIVE DISEASE RESISTANCE1-like homolog, and pathogenesis associated lipid transfer proteins (Table S6). Homologs of other genes involved in responses to salt, ozone, other abiotic stresses and pathogen infection were evident in AI cells and/or EAE sacs (Table S6). The significance of this is unclear and may reflect involvement in aspects of Hieracium embryo sac growth. However, three genes that fit into this "stress-pathogen" category, an abscisic acid induced RD22-like gene, a CC-NBS-LRR-like resistance gene and a putative lipoxygenase-like gene, are coordinately up-regulated in the AI cell, degenerating nucellar epidermal cells and megaspores in apomictic ovules, and are undetectable by in situ in sexual ovules, suggesting association with apomictic initiation and sexual suppression (Figure 1D-G; Figure S3).
Downloaded from www.plantphysiol.org on August 16, 2017. Copyright © 2013 American Society of Plant Biologists. All rights reserved.
The functional associations of such stress and pathogenesis-related genes in apomictic events warrant further investigation.
[...] florets or microdissected ovules and analyses involving differential screening and comparative cDNA sequencing (Vielle-Calzada et al., 1996; Tucker et al., 2001; Rodrigues et al., 2003; Albertini et al., 2004; Singh et al., 2007; Cervigni et al., 2008; Laspina et al., 2008). More recently, high throughput RNA sequencing technologies have been employed (Sharbel et al., 2009; Sharbel et al., 2010). Deep sequencing analyses in microdissected diploid sexual and apomictic Boechera ovules, which undergo diplospory, have identified a down-regulation of gene expression in apomictic ovules relative to sexual ovules around the time of MMC development and its switch to mitotic embryo sac formation. However, there was no obvious developmental pathway or timing change that could simply explain the shift to apomixis. Transcription factors were overrepresented among apomixis-specific genes, suggestive of large-scale regulatory changes in apomictic ovules (Sharbel et al., 2010; Hofmann, 2010).

Apomixis control and transcriptional analyses of apomixis events
Depending on the aposporous species, AI cell development may occur at various stages relative to the temporal sequence of sexual events. In some species the sexual pathway persists even though aposporous embryo sacs form (Koltunow and Grossniklaus, 2003). We were unable to find obvious commonality of gene expression categories between our Hieracium cell-type transcriptomes and the available transcriptome information from the aposporous grass species Poa pratensis, Paspalum notatum and Pennisetum glaucum (Albertini et al., 2004; Laspina et al., 2008; data not shown). This may reflect sequencing depth and associated limitations in the ability to resolve differential expression in AI cells or unreduced female gametophytes within the complex starting material employed in these studies. Apospory may also involve changes in a subset of commonly expressed genes in sporophytic and gametophytic cells whose action is reflected at the post-transcriptional or protein level. Alternatively, apospory in grass species and in the eudicot Hieracium may result from different molecular mechanisms.
While studies in whole ovules may indicate association of candidate pathways with apomictic reproduction, the cells involved in the process form a small percentage of the total ovule cell mass, spatial validation by in situ hybridization does not feature in some studies, and functional validation is limited by the ability to transform the species. Isolation of specific cell types enables direct transcriptional comparisons (Kerk et al., 2003; Day et al., 2005). Here we have confirmed the efficacy of LCM in combination with 454 pyrosequencing, bioinformatic analyses and in situ hybridization to explore the transcriptomes of two apomictic cell types, the aposporous initial (AI) cell and the early aposporous embryo (EAE) sac of Hieracium praealtum, in relation to somatic ovule (SO) cells not participating in these events. We also compared our expression data with data from various ovule cell types isolated by laser capture from Arabidopsis.

The enlarging AI cell is transitioning to a mitotic embryo sac program prior to nuclear division
While the transcriptome sequence datasets obtained from the laser-captured Hieracium cell types were not saturating for quantitative analyses, low-abundance and cell-specific transcripts and protein domains were identified in conjunction with a range of putative homologs found in ovules of other species. The presence of low-abundance transcripts and overlaps with putative ovule-expressed homologs suggests that this dataset can provide useful insights into discriminating cellular functions of the enlarging AI cell.
Gene ontology categories enriched in each examined cell type are summarized in Figure 3. These enrichments were identified through pairwise comparison of the three cell types (AI to SO cell, AI to EAE sac and EAE sac to SO cell) and showed that the AI cell transcriptome displays a similar functional profile to the EAE sac transcriptome.
The AI cell transcriptome does not exhibit expression of meiosis-associated genes that are conserved in many plant species. Relative to SO cells and EAE sacs the AI cell is [...] (Nonomura et al., 2003; Nonomura et al., 2007; Zhao et al., 2008; Garcia-Aguilar et al., 2010; Olmedo-Monfil et al., 2010). Disruption of sporophytic ARGONAUTE function in maize can lead to a change in MMC cell fate such that it bypasses meiosis and undergoes mitosis, forming a diploid embryo sac (Singh et al., 2011).
Recent data suggest that integrity of small RNA pathways in sporophytic ovule tissues is also important for sequential progression between megasporogenesis and megagametogenesis in Arabidopsis (Tucker et al., 2012a).

Roles of ubiquitin proteasome pathways in meiotic avoidance and apomixis
The anaphase-promoting complex/cyclosome (APC/C) is an evolutionarily conserved E3 ubiquitin ligase that drives cell cycle progression by targeting cell cycle proteins for degradation. The enrichment of ubiquitin proteasome components observed in this study in AI cells, during their growth and transition to the mitotic events of embryo sac formation, is in keeping with the function of this pathway in mitotic cell types. The Arabidopsis protein OMISSION OF THE SECOND DIVISION (OSD1) is associated with and negatively regulates the APC/C. OSD1 functions in both divisions of meiosis and, interestingly, loss of its function leads to omission of the second meiotic division. In osd1 cyclin A1;2 double mutants, the first and second meiotic divisions are avoided (d'Erfurth et al., 2009; d'Erfurth et al., 2010).
Genomic sequences tightly linked to the apospory locus HAPPY (for Hypericum APOSPORY) in tetraploid Hypericum perforatum (St John's wort) contain a truncated gene (HpARI) homologous to Arabidopsis ARIADNE7, which encodes a RING-finger E3 ligase protein predicted to be involved in various regulatory processes related to ubiquitin-mediated protein degradation. The HpARI marker co-segregates with apospory but not autonomous embryo formation and is inherited in a dominant manner in aposporous segregants. Three intact "sexual alleles" are also present in the tetraploid apomict and these are co-expressed with the truncated HpARI gene in a variety of plant tissues. HpARI is proposed to act in a dominant negative manner at the protein level to influence alterations in gametophyte development (Schallau et al., 2010). Cysteine-rich RING domains characteristic of the ARIADNE family are found in contigs from all three Hieracium laser-captured cell types and merit further characterization.
A range of gene homologs up-regulated in AI cells and EAE sacs relative to SO cells are involved in processes regulated by the ubiquitin proteasome pathway including auxin and jasmonate signaling, flower development, R-gene mediated pathogen resistance, abiotic stress (drought), cell cycle progression and gametophyte development. Thus it is conceivable that the LOA locus may influence this pathway in apomictic Hieracium.

CONCLUSIONS
Laser capture microdissection in this study has enabled an analysis of the indicative functions of specific ovule cell types at early stages of aposporous embryo sac formation in apomictic Hieracium. We have determined that enlarging AI cells appear to be transitioning to a gametophytic program prior to their first nuclear division. Future examination of transcriptional profiles of laser-captured cell types from sexual, apomictic and mutant Hieracium that have lost LOA and/or LOP function, in conjunction with whole ovule transcriptome analysis from the same material using the depth afforded by the Illumina short read platform should provide a comprehensive, quantitative analysis of large and small RNA pathways operating during apomictic reproduction.
Ovaries were dissected from the fixed florets in 70% ethanol and further dehydrated to 100% ethanol. Tissue was infiltrated at 4 °C with a butyl methyl methacrylate (BMM) solution (79.5% n-butyl methacrylate, 20% methyl methacrylate, 0.5% benzoin methyl ether from ProSciTech, Kirwan, Queensland, Australia with 1 mM DTT) and ethanol in the following volume ratios: 1:3, 1:1, 3:1, 1:0, 1:0, with gentle agitation for 12 hours each change (Baskin et al., 1992). Individual ovaries were placed in BEEM capsules (ProSciTech, Kirwan, Queensland, Australia) and polymerized at -20 °C under UV light (6 W lamp) for 3-5 days. Serial ovule sections, 5 µm thick, were cut using glass knives with a rotary microtome (Model 2055, Leica Microsystems, Wetzlar, Germany), floated on sterile water on a Leica PEN membrane coated microscope slide and dried at 42 °C prior to long-term storage at 4 °C. BMM was removed by slide immersion in 100% acetone for 10 minutes and drying at 42 °C prior to LCM. Cells were dissected from sections using a Leica AS Laser Microdissection system (Leica Microsystems, Wetzlar, Germany) equipped with a 63× objective at aperture 5-6, and a UV laser (337 nm wavelength) at intensity 40-46 and tracking speed 3-4. AI cells with undivided nuclei were captured from 270 individual 5 µm sections (see Figure S1). A total of 100 early aposporous embryo (EAE) sacs containing two to four nuclei were also harvested from 5 µm sections. Clusters of approximately 50 sporophytic ovule (SO) cells were cut from 50 individual 5 µm ovule sections after AI cells and EAE sacs had been harvested. Harvested cells were collected in the cap of a 0.2 ml PCR tube and the RNA was either isolated immediately or the captured cells were stored at -80 °C.

RNA isolation and amplification from LCM sections
Total RNA was extracted from the captured cell types using a PicoPure™ RNA isolation kit (Arcturus Bioscience Inc, Mountain View, CA, USA) according to the [...]

Raw 454 sequences were trimmed for adapter sequences, low quality and ambiguous bases using the Lucy algorithm version 1.2.0 (Chou and Holmes, 2001) with default parameters. Trimmed sequences shorter than 100 bases were discarded. In total, 79.9% of raw sequences passed quality filtering for use in assembly. All quality filtered sequence reads were used to query the protein family database, Pfam (http://pfam.sanger.ac.uk), to explore protein domain annotations predicted in the transcriptome sequence. The use of the larger set of sequence reads before assembly provided more direct access to individual read count information, and also provided a means for analysis independent of the assembly algorithm. Sequence reads were translated in six frames and putative peptides longer than 20 amino acids were analyzed against Pfam using a blastx E-value threshold of 1E-5. Where multiple acceptable matches were found, each sequence was annotated by the best scoring Pfam domain. Pfam domains observed in at least 5 reads were annotated for GO functional classes through the Pfam database, and analyzed for enrichment in pairwise comparisons between the three cell types using the Fisher exact test (P value ≤ 0.05) with FDR correction for multiple testing (Benjamini and Hochberg, 1995).
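The pairwise enrichment test described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the example domain labels and read counts are hypothetical, and it assumes that the "at least 5 reads" filter applies to the combined count of the two samples being compared and that the 0.05 threshold is applied to the FDR-adjusted value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x domain-bearing reads in sample 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def domain_enrichment(counts_a, counts_b, total_a, total_b, min_reads=5):
    """Return {domain: (P value, adjusted P value)} for one pairwise comparison.

    counts_a / counts_b map a domain label to its read count in each sample;
    total_a / total_b are the numbers of annotated reads in each sample.
    Assumption: min_reads is applied to the combined count of both samples.
    """
    domains = [
        d for d in sorted(set(counts_a) | set(counts_b))
        if counts_a.get(d, 0) + counts_b.get(d, 0) >= min_reads
    ]
    pvalues = []
    for d in domains:
        a, c = counts_a.get(d, 0), counts_b.get(d, 0)
        pvalues.append(fisher_exact_two_sided(a, total_a - a, c, total_b - c))
    return dict(zip(domains, zip(pvalues, benjamini_hochberg(pvalues))))


# Hypothetical read counts for two made-up domains in two samples.
example = domain_enrichment(
    {"domain_A": 30, "domain_B": 5},
    {"domain_A": 5, "domain_B": 6},
    total_a=1000,
    total_b=1000,
)
enriched = [d for d, (_, q) in example.items() if q <= 0.05]
```

In this sketch each domain is tested against all other annotated reads in the same two samples, and the adjustment is applied within a single pairwise comparison; whether the original analysis corrected within or across comparisons is not stated in the text.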

Contig assembly and gene annotation
The Mira algorithm version 3.2.0 (Chevreux et al., 2004) was used for assembly with the "accurate" parameter setting. The assembly resulted in 8,044 contigs for SO cells, 8,780 for AI cells and 5,002 for EAE sacs. To generate the combined assembly, quality filtered sequences from the three samples were combined and assembled, resulting in 18,219 contigs. These were further filtered to those that had an average read coverage across their length of at least 3 reads. Contig sequences from the combined assembly were compared against each cell-type assembly and sequences were considered matched if they had a blastn hit (E-value ≤ 1E-5) with a match extending over at least 80% of the query or target [...]

[...] An EASE Arabidopsis ontology reference dataset was created for this purpose from the GO annotations available at TAIR10.
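The contig-matching criterion described above (a blastn hit below an E-value threshold that spans at least 80% of the query or target) can be sketched as follows. This is an illustrative sketch only: the `Hit` record, the function names and the example values are hypothetical, alignment coordinates are assumed to be 1-based and inclusive, and the thresholds are exposed as parameters because the methods text and the table footnote quote different E-value cutoffs.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One blastn alignment between a combined-assembly contig and a cell-type contig."""
    query: str
    target: str
    evalue: float
    q_start: int
    q_end: int
    t_start: int
    t_end: int


def is_match(hit, q_len, t_len, max_evalue=1e-5, min_fraction=0.8):
    """True if the hit passes the E-value cutoff and covers enough of either sequence."""
    # abs() handles hits on the reverse strand, where start > end.
    q_span = abs(hit.q_end - hit.q_start) + 1
    t_span = abs(hit.t_end - hit.t_start) + 1
    covers = q_span / q_len >= min_fraction or t_span / t_len >= min_fraction
    return hit.evalue <= max_evalue and covers


def matched_queries(hits, lengths, max_evalue=1e-5, min_fraction=0.8):
    """Set of query contigs with at least one acceptable hit.

    Using a set means a contig with several equally good or redundant
    hits is counted only once.
    """
    return {
        h.query
        for h in hits
        if is_match(h, lengths[h.query], lengths[h.target], max_evalue, min_fraction)
    }


# Hypothetical contig lengths and hits.
example_lengths = {"c1": 100, "c2": 100, "ai1": 500, "ai2": 100}
example_hits = [
    Hit("c1", "ai1", 1e-20, 1, 90, 5, 94),    # covers 90% of the query
    Hit("c1", "ai1", 1e-18, 1, 85, 300, 384), # redundant second hit for c1
    Hit("c2", "ai2", 1e-3, 1, 100, 1, 100),   # full length but weak E-value
]
example_matched = matched_queries(example_hits, example_lengths)
```

Only the alignment span is used here to approximate overlap; a fuller treatment would merge multiple high-scoring segments between the same pair of contigs before computing coverage.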

Normalization of sequence reads to number of the contigs and validation by Q-PCR
To compare abundance of contig sequences between cell types and for Q-PCR validation, the number of sequence reads attributable to each contig was normalized to

SUPPLEMENTARY MATERIAL
Supplementary Protocol 1: Laser capture microdissection, RNA isolation and amplification.
Figure S1. Validation of quality and quantity of amplified RNA from laser capture microdissected cell types from H. praealtum ovule sections.

[...] Antisense (AS) probes were used in D and F and control sense (S) probes in E and G. In situ analysis of the other two genes in H. praealtum and in sexual H. pilosella is shown in Figure S3. Bar = 20 µm.

[...] least 80% of the length of the shorter contig. Where equally high-scoring or redundant matches were found, a single match was counted. Contig sequences were compared to databases with annotation and pairwise GO analyses. Statistics for these analyses are presented in Table III.

B, Box plots of ribosomal and ubiquitin-associated gene expression in sexual Arabidopsis in nucellar ovule tissues compared with developing female gametophytes.

Tables

Table I.
a Number of low-expressed ovary-enriched genes found in assembled contigs is divided by the number of genes confirmed by RT-PCR analysis found in amplified RNA used for 454 sequencing (Figure 1C; Table S2).
b Correlation of the number of normalized sequence reads in assembled contigs and their expression in the three LCM samples by Q-PCR. Five randomly chosen contigs were compared for each cell type and Pearson correlation with standard deviation is shown.
c Contig sequences were considered matched if reaching a blastn E-value threshold < 1E-10 and a minimum of 80% overlap