Regulatory mechanisms underlying oil palm fruit mesocarp maturation, ripening, and functional specialization in lipid and carotenoid metabolism

Fruit provide essential nutrients and vitamins for the human diet. Not only is the lipid-rich fleshy mesocarp tissue of the oil palm (Elaeis guineensis) fruit the main source of edible oil for the world, but it is also the richest dietary source of provitamin A. This study examines the transcriptional basis of these two outstanding metabolic characters in the oil palm mesocarp. Morphological, cellular, biochemical, and hormonal features defined key phases of mesocarp development. A 454 pyrosequencing-derived transcriptome was then assembled for the developmental phases preceding and during maturation and ripening, when high rates of lipid and carotenoid biosynthesis occur. A total of 2,629 contigs with differential representation revealed coordination of metabolic and regulatory components. Further analysis focused on the fatty acid and triacylglycerol assembly pathways and during carotenogenesis. Notably, a contig similar to the Arabidopsis (Arabidopsis thaliana) seed oil transcription factor WRINKLED1 was identified with a transcript profile coordinated with those of several fatty acid biosynthetic genes and the high rates of lipid accumulation, suggesting some common regulatory features between seeds and fruits. We also focused on transcriptional regulatory networks of the fruit, in particular those related to ethylene transcriptional and GLOBOSA/PISTILLATA-like proteins in the mesocarp and a central role for ethylene-coordinated transcriptional regulation of type VII ethylene response factors during ripening. Our results suggest that divergence has occurred in the regulatory components in this monocot fruit compared with those identified in the dicot tomato (Solanum lycopersicum) fleshy fruit model.

Fruit development, maturation, and ripening are complex biological processes unique to plants.
The monocotyledonous oil palm (Elaeis guineensis) fruit is a drupe whose thick fleshy mesocarp is exceptionally rich in oil (80% dry mass), making this species the highest oil-yielding crop in the world (Murphy, 2009). The mesocarp is also especially abundant in carotenoids, and crude palm oil is the richest dietary source of provitamin A (Sambanthamurthi et al., 2000; Solomons and Orozco, 2003). Surprisingly, the molecular basis of oil palm fruit development, maturation, and ripening has received very little attention. In contrast, the fleshy berries such as tomato (Solanum lycopersicum) and grape (Vitis vinifera) are considered models due to the wealth of genome resources and genetic transformability. In these species, hormones play key roles. For example, ripening appears to be controlled by both ethylene-dependent and ethylene-independent pathways, shown by the characterization of the tomato ripening-related mutants ripening-inhibitor (rin), Colorless non-ripening (cnr), Never-ripe, and Green-ripe (Wilkinson et al., 1995; Vrebalov et al., 2002; Barry and Giovannoni, 2006; Manning et al., 2006; Giovannoni, 2007). Abscisic acid (ABA) can also control tomato ripening through the activation of ethylene biosynthesis (Zhang et al., 2009a). In grape, whereas ripening was previously thought to be ethylene independent, ABA appears to control ripening along with auxin and ethylene (Coombe and Hale, 1973; Davies et al., 1997; Chervin et al., 2004, 2008). Another common regulatory feature of fruit has emerged from functional analyses with the MADS box family of transcription factors (TFs), including RIN and TOMATO AGAMOUS-LIKE1 (TAGL1) from tomato, PLENA from peach (Prunus persica), and FaMADS9 from strawberry (Fragaria × ananassa), which have provided evidence for roles during ripening (Vrebalov et al., 2002, 2009; Tadiello et al., 2009; Seymour et al., 2011).
The recent identification of diverse MADS box genes expressed during banana (Musa acuminata) ripening raises the questions of whether the function of some MADS box proteins is common to monocots and whether divergence occurred after the separation of dicots and monocots (Liu et al., 2009; Elitzur et al., 2010). It is clear that tomato and grape, with their different modes of ripening, provide an excellent framework to compare the maturation and ripening regulatory mechanisms in diverse eudicot species. However, due to the lack of data, it is not clear whether similar or diverse regulatory mechanisms function during maturation and ripening in monocot fruit species in general and in oil-accumulating fruit tissues such as the oil palm mesocarp in particular. Indeed, the low content or lack of triacylglycerols (TAGs) and provitamin A in these eudicot model species makes them inadequate for understanding the regulatory mechanisms that function during maturation and ripening in the oil-rich mesocarp.
Current knowledge about oil synthesis is derived mainly from research on seeds in which TAGs accumulate during maturation (Bates et al., 2009; Baud and Lepiniec, 2010). In particular, the use of genomics and the model species Arabidopsis (Arabidopsis thaliana) has allowed new insights into seed lipid synthesis (Lu et al., 2009; Zhang et al., 2009b; Li-Beisson et al., 2010) and its regulation. Although genes and enzymes involved in some steps of fatty acid (FA) and TAG synthesis still have to be identified or functionally validated, the knowledge is advanced enough to enable oil synthesis pathway reconstruction from transcriptome data in nonmodel species (Joët et al., 2009). In contrast, far less is known about the molecular basis of lipid metabolism in the fleshy fruits that accumulate high amounts of TAG in their mesocarp, such as avocado (Persea americana), olive (Olea europaea), and oil palm. Furthermore, very few lipid-related genes have been cloned in oil palm (Othman et al., 2000; Nakkaew et al., 2008), and most of the studies on oil metabolism in this species deal with metabolic control analysis experiments using callus cultures (Ramli et al., 2009). To our knowledge, lipid synthesis in the developing oil palm fruit has never been investigated through transcriptome analysis. In contrast, transcriptomic studies of developing seeds have revealed many key features of oil synthesis. The main processes leading to the accumulation of TAGs in the seed begin in the plastid with the de novo formation of acyl chains (up to 18 carbons), followed by desaturation at the Δ9 position and FA release. They then continue in the endoplasmic reticulum (ER) with the sequential acylation of the three positions of the glycerol backbone, additional FA desaturation, esterification to phosphatidylcholine (PC) and acyl editing, and finally storage within specialized organelles called oil bodies.
The identification of a very limited number of expression patterns for the numerous genes involved in FA and TAG synthesis first suggested that the transcriptional network that controls seed lipid metabolism is remarkably coordinated (Ruuska et al., 2002). In the Arabidopsis embryo, almost all genes involved in de novo FA synthesis show the same timing and pattern of expression (i.e. a bell-shaped pattern whose peak coincides with the onset of lipid accumulation). In contrast, the transcription of genes required for TAG assembly in the ER rises later and remains high during the maturation process. These patterns have also been observed in the copious persistent endosperm of coffee (Coffea arabica), another seed tissue that accumulates oil (Joët et al., 2009). The analysis of Arabidopsis seed microarray data also revealed that the fold increase of most genes involved in TAG assembly and storage was significantly higher than that of genes of the core FA biosynthetic machinery. Whether these important characteristics of seed oil synthesis are conserved and function in nonseed tissues such as the mesocarp of the oil palm fruit remains unknown.
The fact that most genes governing de novo FA synthesis in the plastid share the same temporal transcription pattern strongly suggests that they are coregulated and probably share common cis- and trans-regulatory elements. The isolation of the Arabidopsis wrinkled1 (wri1) mutant constituted a key milestone in the validation of this hypothesis. This mutant is specifically impaired in seed TAG accumulation and shows lower transcript levels for key enzymes involved in lipid and carbohydrate metabolism (Focks and Benning, 1998; Ruuska et al., 2002). Furthermore, overexpression of WRI1, which encodes an APETALA2 (AP2)/ethylene-responsive element binding protein family TF, led to increased seed oil content and up-regulation of certain FA biosynthesis genes (Cernac and Benning, 2004; Maeo et al., 2009). Indeed, WRI1 interacts directly with an AW box sequence [CnTnG(n)7CG] in the upstream regions of several FA biosynthesis genes (Maeo et al., 2009). In addition, WRI1 expression appears to be under the control of LEAFY COTYLEDON1 (LEC1) and LEC2, two master regulators of seed maturation (Baud et al., 2007; Mu et al., 2008). While an analogous regulatory system appears to function in maize (Zea mays) through the ZmWRI1 and ZmLEC1 orthologs (Shen et al., 2010), a maize ortholog of LEC2 has not been identified, suggesting differences in this key regulatory network between monocots and dicots. Finally, whether similar regulatory systems function in the lipid-rich mesocarp during fruit ripening compared with that during seed maturation is unknown.
The oil palm mesocarp accumulates not only the largest amount of TAGs of all crop species but also large amounts of carotenoids (500 µg g⁻¹ mesocarp dry mass), predominantly α- and β-carotene (Ikemefuna and Adamson, 1984). Carotenes are precursors of vitamin A, an essential nutrient for human beings. Carotenoids and their oxygenated derivatives xanthophylls give fruits their pigmentation and play an important role as visible signals to attract animals for seed dispersal (Tanaka et al., 2008). The cleavage of carotenoids also generates apocarotenoids, including the plant hormone ABA, with emerging roles during fruit ripening (Davies et al., 1997; Deluc et al., 2007; Gambetta et al., 2010). In addition, volatile apocarotenoid compounds such as β-ionone and geranylacetone have animal-attracting characteristics with very low odor thresholds and play crucial roles in fruit dispersion (Auldridge et al., 2006a, 2006b). Despite the attractive quantitative and qualitative carotenoid composition of the oil palm mesocarp, very few studies have focused on the transcriptional mechanisms underlying carotenoid gene metabolism or performed metabolite profiling during mesocarp development (Ikemefuna and Adamson, 1984; Khemvong and Suvachittanont, 2005; Rasid et al., 2008). Only two genes, PHYTOENE SYNTHASE (PSY) and 1-DEOXY-XYLULOSE 5-PHOSPHATE SYNTHASE (DXS), have been investigated to date in oil palm. PSY and DXS are involved in the first committed step of carotenoid biosynthesis and the upstream 2-C-methyl-D-erythritol 4-phosphate pathway, respectively. By contrast, virtually all carotenogenic genes have been identified in model plants (Hirschberg, 2001; Howitt and Pogson, 2006; Chen et al., 2010), and this genomic knowledge should facilitate the identification of gene orthologs in the oil palm mesocarp.
During the ripening of the fleshy tissue of tomato, chloroplasts redifferentiate to chromoplasts that accumulate large amounts of lycopene, a compound with strong antioxidant properties but no provitamin A activity (Arango and Heise, 1998; Josse et al., 2000). While tomato is a reference for carotenogenesis in fleshy fruit, it does not constitute an optimal model to study the transcriptional regulation of downstream pathways of carotene synthesis.
Our objective is to provide a basis to understand the molecular regulation and coordination of TAG and carotenoid biosynthesis during oil palm mesocarp maturation and ripening compared with those found in seeds and nonoily fruit. We used morphological, histological, and biochemical analyses to define phases of oil palm fruit development, maturation, and ripening, and in parallel, we generated and analyzed the mesocarp transcriptome during maturation and ripening using 454 pyrosequencing. By studying the accumulation of oil and carotenoids and hormonal profiling during oil palm fruit development and analyzing the differentially expressed transcripts, we have characterized key regulatory steps of the specific pathways and corresponding genes that are likely to function during fruit maturation and ripening. We describe the unique characters of lipid and carotenoid metabolism in this species and provide insight into the transcriptional coordination and hormonal regulation in the mesocarp tissue compared with those in developing seeds and fleshy fruits. An overview of the metabolic networks involved in lipid and carotenoid accumulation and transcriptional networks in the oil palm mesocarp will allow comparison with model species and help identify candidate genes and molecular markers associated with these important chemical traits for breeding programs.

Under the field conditions and with the genetic material used, the fruit of oil palm completed their development, maturation, and ripening in approximately 160 d. Similar to what occurs in other drupes, a biphasic growth curve was observed, with an initial increase in fruit mass and size measured between 30 and 60 d after pollination (DAP; Fig. 1, A and B). After a 40-d lag period (60-100 DAP), a further increase in fruit mass was observed between 100 and 160 DAP, in particular between 140 and 160 DAP, accompanied by an increase in fruit size (Fig. 1, A and B).
A similar pattern was observed with the mesocarp fresh mass, which represents approximately 75% of that of the ripe fruit. Notably, a large increase in dry mass between 120 and 160 DAP reflects lipid accumulation in this tissue (Fig. 1, C and E). Further analysis of selected histological, biochemical, and hormonal parameters allowed us to define five distinct phases of mesocarp development. Phase I, between 30 and 60 DAP, is defined by anticlinal cell divisions and expansion along with the initial increase in fruit mass and size (Fig. 1, A-C and F). Phase II, between 60 and 100 DAP, is a transition period characterized by a lag in the accumulation of fresh mass and also by peak amounts of indole-3-acetic acid (IAA) and IAA conjugates (Fig. 1D; Supplemental Fig. S1). Phase III, between 100 and 120 DAP, is the end of the transition period, during which decreases in auxin, gibberellic acid (GA), and cytokinin metabolites are observed (Fig. 1D; Supplemental Fig. S1). Phase IV is concomitant with the beginning of maturation, characterized by an increase in mesocarp fresh mass and the beginning of lipid accumulation detected by 120 DAP (Fig. 1, C, E, and G). Phase IV, between 120 and 140 DAP, is characterized by lipid (more than 2 g per fruit) and carotenoid accumulation in addition to the low amounts of all hormone metabolites examined, including auxin, GA and cytokinin, ABA, and ethylene (Fig. 1, D and E; Supplemental Fig. S1). Finally, during the ripening phase V, there is a large increase in the hormones ABA and ethylene, and cell wall detachment related to ripening processes in the mesocarp is visualized (Fig. 1D; Supplemental Fig. S1). During this phase, dry and fresh fruit mass increase massively, concurrently with lipid and carotenoid accumulation in the mesocarp (Fig. 1, A, C, and E). As observed at 160 DAP, lipids accumulate within subcellular spherical organelles (10-15 µm in diameter, six to 12 per mesocarp cell) that occupy the volume of the cells (Fig. 1H).
In addition, as observed in ripe fruits, mesocarp cells contain distinct regions, presumably chromoplasts, with high carotenoid concentrations (Fig. 1H).

A total of 29,034 contigs were obtained after clustering and assembly of good-quality sequence reads (Supplemental Figs. S2 and S3). Read amounts associated with each of the 29,034 contigs were then analyzed using Audic-Claverie and false discovery rate (FDR) statistics, leading to the identification of 2,629 differentially expressed contigs, which are henceforth referred to as contig group I, while the remaining 26,405 contigs with either nondifferential representation or low read abundance are referred to as contig group II. Hierarchical cluster analysis of group I contigs allowed the identification of four major clusters, termed A, B, C, and D, which contain 695, 418, 1,167, and 349 contigs, divided into five, four, five, and three subclusters, respectively (Supplemental Fig. S4). In general, contigs that exhibited a transcription peak at 100, 120, 140, or 160 DAP were grouped in cluster A, B, C, or D, respectively. Gene Ontology (GO) accessions were assigned to the 2,629 group I contigs using the Blast2Go platform (Götz et al., 2008). A total of 2,074 contigs (79%) were assigned GO accessions, 488 in cluster A (70%), 377 in cluster B (90%), 908 in cluster C (78%), and 301 in cluster D (86%). To identify differences in the GO annotations of the four clusters that reflect underlying molecular processes, an enrichment analysis of the annotations of each individual cluster compared with the annotations of the 2,074 contigs was performed using the GOSSIP platform (Blüthgen et al., 2005). Cluster B had the most overrepresented GO annotations (131 terms), followed by cluster D (22) and cluster C (four), while cluster A had only underrepresented GO annotations (Supplemental Table S1). Cluster C had the most underrepresented annotations (51 terms), followed by cluster D (43), cluster B (32), and cluster A (17).
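The differential-representation test named above can be illustrated with a short sketch. It assumes the standard Audic-Claverie conditional probability p(y|x) for observing y reads in one library given x reads in another, a two-sided p-value obtained by doubling the smaller tail, and a Benjamini-Hochberg adjustment for the FDR step. The text does not state which FDR procedure or thresholds were used, so the function names, the two-sided convention, and the choice of Benjamini-Hochberg are assumptions made for illustration only.

```python
import math

def ac_prob(x, y, n1, n2):
    """Audic-Claverie probability of y reads in library 2 (size n2)
    given x reads for the same contig in library 1 (size n1)."""
    r = n2 / n1
    log_p = (y * math.log(r)
             + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
             - (x + y + 1) * math.log(1 + r))
    return math.exp(log_p)

def ac_pvalue(x, y, n1, n2):
    """Two-sided p-value (assumed convention: twice the smaller tail).
    The upper tail is computed as 1 - P(Y < y), so precision is limited
    to roughly 1e-15 for extreme counts."""
    below = sum(ac_prob(x, k, n1, n2) for k in range(y))   # P(Y < y)
    lower = below + ac_prob(x, y, n1, n2)                   # P(Y <= y)
    upper = max(0.0, 1.0 - below)                           # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

if __name__ == "__main__":
    # Hypothetical read counts for three contigs in two equally sized libraries.
    counts = [(10, 10), (5, 100), (40, 12)]
    pvals = [ac_pvalue(x, y, 200_000, 200_000) for x, y in counts]
    for (x, y), q in zip(counts, benjamini_hochberg(pvals)):
        print(x, y, q)
```

With more than two developmental stages, one option is to apply the pairwise test to each pair of libraries and adjust all resulting p-values together; how the comparisons were actually combined is not specified here.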
The most significantly underrepresented GO terms in cluster A were related to carbohydrate metabolic processes, including glycolysis, while the most significantly overrepresented terms in cluster B were FA biosynthetic process, cellular carbohydrate metabolic process, and lipid biosynthetic process. The results suggest that in the 120-DAP mesocarp, at the beginning of phase IV, significant changes to the transcriptome coincide with the onset of the observed lipid accumulation (Fig. 1E). By contrast, terms related to nucleic acid binding, nucleus, DNA binding, and lipid catabolic process are overrepresented in cluster C, which suggests that major transcriptional regulatory changes are initiated by the end of phase IV at the onset of ripening. Finally, the overrepresented terms in cluster D relate to carbohydrate metabolic processes, including O-glycosyl compound hydrolysis, glycosyl bond hydrolysis, glucan endo-1,3-β-D-glucosidase activity, chitinase activity, and cellular aromatic compound metabolic processes, which suggests that ripening processes such as cell wall modifications are taking place by 160 DAP.
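To make the notion of an overrepresented GO term concrete, the following sketch computes a one-sided hypergeometric (Fisher-type) enrichment p-value for a single term in a single cluster. The internal procedure of the GOSSIP platform is not described in the text, so the choice of test is an assumption. The cluster and background sizes (377 and 2,074 annotated contigs) come from the text, whereas the per-term counts are hypothetical.

```python
from math import comb

def enrichment_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: annotated contigs in the background
    K: background contigs carrying the GO term
    n: annotated contigs in the cluster
    k: cluster contigs carrying the GO term
    """
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(K, n) + 1))
    return favourable / total

if __name__ == "__main__":
    # 2,074 annotated group I contigs, 377 of them in cluster B (from the text).
    # Hypothetical term: 60 background contigs, 30 of which fall in cluster B.
    print(enrichment_pvalue(k=30, n=377, K=60, N=2074))
```

Underrepresentation can be assessed in the same way with the lower tail, P(X <= k). Because many terms are tested per cluster, the resulting p-values would still need a multiple-testing correction.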

Overall, this analysis indicates that certain changes in the transcriptome are coordinated and reflect the biochemical and physiological processes observed during the maturation and ripening phases of the mesocarp (Fig. 1).
Reconstructing FA and TAG Biosynthetic Pathways in the Oil Palm Mesocarp Using a Pyrosequencing-Based Transcriptome
Based on our current knowledge of oil synthesis in seeds, the pyrosequencing-based transcriptome approach enabled us to reconstruct the FA and TAG synthesis pathways in the oil palm mesocarp (Fig. 2). Indeed, for almost all lipogenic steps, contigs that exhibited high similarity to genes from Arabidopsis and other oleaginous species were found (Supplemental Table S2).
Three enzymes appear to participate in the acylation of diacylglycerol (DAG) to TAG in the oil palm mesocarp. First, the presence of two contigs (CL875Contig1 and CL1Contig9221) similar to type 1 and type 2 diacylglycerol acyltransferases (DGAT), respectively, suggests that both classes of DGAT play a role in TAG synthesis in this tissue. Moreover, the route involving PC as the acyl donor may make a similar contribution to the final acylation step of DAG, given that a contig (CL1451Contig2) highly similar to an Arabidopsis phospholipid:diacylglycerol acyltransferase (AtPDAT1; At5g13640) had total read amounts comparable to those of the DGAT1- and DGAT2-like contigs (about 40 reads per 200,000). Furthermore, no contig was found for Arachis hypogaea cytosolic DGAT or the Arabidopsis bifunctional wax ester synthase/DGATs, nor did we identify contigs for an acyltransferase with membrane-bound O-acyltransferase domains that displayed read levels higher than the DGAT- and PDAT-like contigs described above. Overall, these results suggest that the three DGAT1-, DGAT2-, and PDAT-like contigs identified indeed encode the enzymes responsible for final TAG assembly in the oil palm mesocarp.
Because the reaction catalyzed by PDAT also produces lysophosphatidylcholine, a lysophospholipid acyltransferase is necessary to cooperate with PDAT to regenerate PC from lysophosphatidylcholine. Interestingly, a contig (CL212Contig1) highly similar to AtLPLAT1 and AtLPLAT2, two lysophospholipid acyltransferases recently characterized in Arabidopsis (Ståhl et al., 2008), had read amounts similar to the PDAT- and DGAT-like contigs. It is also possible that this lysophospholipid acyltransferase-like contig participates in acyl editing (Bates et al., 2009), although the requirement of this mechanism in a tissue producing an oil poor in polyunsaturated FAs is unknown. In this regard, no contig with high similarity to Arabidopsis phospholipase A2 was found.
Finally, no contig was identified for the recently characterized phosphatidylcholine:DAG cholinephosphotransferase, which is encoded by ROD1 and contributes to PC-DAG interconversion in developing Arabidopsis seeds together with CDP-choline:DAG cholinephosphotransferase (Lu et al., 2009). Furthermore, the absence of any oleosin transcript in our libraries may provide an explanation for the considerable size of the oil droplets that accumulate in the oil palm mesocarp (10-15 µm; Fig. 1H). This size is similar to that of the large lipid bodies in the seeds of Arabidopsis lines in which oleosins were suppressed or severely attenuated (Siloto et al., 2006).

FA Synthesis in the Plastid and TAG Assembly in the ER Are Governed by Two Different Transcriptional Programs
The reconstruction of the FA and TAG biosynthetic pathways revealed that de novo formation of acyl chains in the plastid and TAG assembly in the ER follow two distinct transcriptional programs. Indeed, almost all genes involved in glycolysis and FA synthesis in the plastid were from group I, which suggests strong developmental transcriptional regulation of this pathway in the mesocarp. In addition, almost all contigs of the core FA biosynthetic machinery showed maximal transcription at 120 DAP and grouped in cluster B, as determined by hierarchical clustering analysis (Supplemental Fig. S4), and displayed high read levels (average of 424 per 200,000 reads, up to 1,993 reads for enoyl-ACP reductase). Furthermore, the transcription peak of the core FA biosynthetic machinery coincided with the onset of oil accumulation in the mesocarp at the beginning of the maturation phase IV (Figs. 1E and 2). By contrast, for the TAG assembly pathway, only contigs for oleate and linoleate desaturases (FAD2 and FAD3) were found within group I. For the enzymes involved in the sequential acylation of the three positions of the glycerol backbone, PC-DAG interconversion, and acyl editing, contigs were identified in group II only, suggesting very little or no developmental control of transcription of genes involved in TAG assembly (Fig. 2). Moreover, contigs related to TAG assembly displayed no clear peak but rather showed low (81 reads per 200,000) and constant read amounts. This unambiguous transcriptional dissimilarity between the pathways in the two compartments and the high homogeneity of patterns observed within each block suggest that FA synthesis in the plastid and TAG assembly in the ER are controlled by two different transcriptional programs.

Figure 2. Contig names followed by asterisks indicate contigs showing significant variations during development according to Audic-Claverie and FDR statistics (contig group I). Gene expression levels at 100, 120, 140, and 160 DAP are indicated with colored bars. For the developmental stage displaying maximal expression level, the normalized transcript abundance, expressed as the number of transcripts per 200,000 transcripts, is given. For the other stages, expression levels are indicated as percentages of the maximal normalized transcript abundance of the gene, as described in the color code from 0% (white) to 100% (dark blue). ACC BC, Biotin carboxylase subunit of heteromeric acetyl-CoA carboxylase (ACCase); ACC Ctα, carboxyltransferase α-subunit of heteromeric ACCase; ACC Ctβ, carboxyltransferase β-subunit of heteromeric ACCase; CPT, diacylglycerol cholinephosphotransferase; DGAT, acyl-CoA:diacylglycerol acyltransferase; EAR, enoyl-acyl carrier protein (ACP) reductase; ENOp, enolase; FAD2, oleate desaturase; FAD3, linoleate desaturase; FATA, acyl-ACP thioesterase A; FATB, acyl-ACP thioesterase B; GPAT, glycerol-3-phosphate acyltransferase; HAD, hydroxyacyl-ACP dehydrase; KAR, ketoacyl-ACP reductase; KAS I, ketoacyl-ACP synthase I; KAS II, ketoacyl-ACP synthase II; KAS III, ketoacyl-ACP synthase III; LACS, long-chain acyl-CoA synthetase; LPAAT, 1-acylglycerol-3-phosphate acyltransferase; LPCAT, 1-acylglycerol-3-phosphocholine acyltransferase; MAT, malonyl-CoA:ACP malonyltransferase; PAP, phosphatidate phosphatase; PDAT, phospholipid:diacylglycerol acyltransferase; PDH, dihydrolipoamide acetyltransferase, E2 component of pyruvate dehydrogenase complex; PK, pyruvate kinase; SAD, stearoyl-ACP desaturase.
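The abundance scale used for these comparisons (transcripts per 200,000 transcripts, with each stage also expressed as a percentage of the stage with maximal abundance) can be sketched as follows. The read counts and library sizes in the example are hypothetical, and the function names are illustrative rather than taken from the study.

```python
def normalize_per_200k(raw_reads, library_sizes, scale=200_000):
    """Scale raw read counts per stage to reads per `scale` reads."""
    return {stage: raw_reads[stage] * scale / library_sizes[stage]
            for stage in raw_reads}

def percent_of_max(abundance):
    """Express each stage as a percentage of the stage with maximal abundance."""
    peak = max(abundance.values())
    if peak == 0:
        return {stage: 0.0 for stage in abundance}
    return {stage: 100.0 * value / peak for stage, value in abundance.items()}

if __name__ == "__main__":
    # Hypothetical raw counts for one contig and hypothetical library sizes,
    # keyed by days after pollination (DAP).
    raw = {100: 12, 120: 310, 140: 150, 160: 40}
    sizes = {100: 180_000, 120: 210_000, 140: 195_000, 160: 205_000}
    normalized = normalize_per_200k(raw, sizes)
    for stage, pct in percent_of_max(normalized).items():
        print(stage, round(normalized[stage], 1), round(pct, 1))
```

Normalizing before taking the percentage of the maximum keeps differences in sequencing depth between libraries from being read as changes in expression.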
Transcripts encoding enzymes involved in de novo FA synthesis were grouped in cluster B, which is characterized by profiles that peak at 120 DAP, and in particular in subcluster B1 (Supplemental Fig. S4). The most highly represented contigs of this subcluster, which contains only 148 contigs, were therefore examined carefully in order to identify potential regulatory elements of FA synthesis. This analysis revealed that the most abundant contig (CL1Contig558, renamed here EgLIP1) of this subcluster is highly similar to a TAG lipase class 3 family protein, which could account for the intensive TAG hydrolysis in the oil palm mesocarp found previously (Ngando Ebongue et al., 2006). The second most abundant contig within subcluster B1 (CL1Contig8373) has high similarity to the transcriptional regulator WRI1 (Supplemental Table S2), demonstrating that the pyrosequencing-based approach chosen in this study together with appropriate statistics constitutes a reliable and straightforward method to examine the coregulation of genes involved in a given pathway and to identify candidates involved in their regulation.
A Transcript Encoding a TF Similar to WRI1 Is Highly Expressed in the Mesocarp during FA Biosynthesis
CL1Contig8373 belongs to the WRI clade (Supplemental Fig. S5), which contains WRI1 (with 68% identity) and two other WRI1-like sequences from Brassica napus (BnWRI1, ABD72476; Liu et al., 2010) and maize (ZmWRI1, AY103852; Shen et al., 2010), for which similar WRI function was shown in the seed. The contig CL1Contig8373, renamed EgAP2-2, peaks at 120 DAP, which corresponds to the onset of lipid accumulation and the transcript profile peak for many FA biosynthesis genes (Figs. 1E and 2). Of the 19 transcripts expressed in the mesocarp that encode FA and TAG biosynthetic enzymes and contained a sufficiently long proximal upstream region (more than 150 bp), six contained canonical AW box elements (Supplemental Fig. S5). All six transcripts encode enzymes similar to those that are active in the plastid and involved in glycolysis (ENOp and PK) and FA synthesis (ACCase CTα, KASIII, SAD, and FATB). The previously described 15-bp element was not found in the 5′ untranslated regions of the oil palm sequences available.
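The AW box consensus [CnTnG(n)7CG], where n is any nucleotide, translates directly into a pattern search. The sketch below is a minimal illustration that scans a single strand of an upstream sequence; the example sequence is made up, and the scanning procedure actually used in the study (strand handling, tolerated mismatches) is not specified in the text.

```python
import re

# CnTnG(n)7CG with n = any nucleotide; the lookahead reports overlapping hits.
AW_BOX = re.compile(r"(?=(C[ACGT]T[ACGT]G[ACGT]{7}CG))")

def find_aw_boxes(upstream_seq):
    """Return (0-based start, matched motif) for each AW box on the given strand."""
    seq = upstream_seq.upper()
    return [(m.start(), m.group(1)) for m in AW_BOX.finditer(seq)]

if __name__ == "__main__":
    # Hypothetical upstream fragment with one embedded AW box.
    fragment = "AAA" + "CATAG" + "AAAAAAA" + "CG" + "TTT"
    print(find_aw_boxes(fragment))
```

Searching the reverse complement as well would catch elements on the opposite strand; whether that is appropriate depends on how the element was defined in the cited work.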
WRI1 is known to be a target of LEC2 and LEC1, which are both master regulators of the seed maturation process and also involved in FA biosynthesis regulation (Baud et al., 2007; Mu et al., 2008; Shen et al., 2010). The analysis revealed an absence of any sequences with significant similarity to transcripts encoding ABSCISIC ACID INSENSITIVE3/VIVIPAROUS1, FUSCA3, and LEC2, all of which belong to the B3 superfamily. Similarly, no contig with HAP3 subunits of the CCAAT-binding factor (IPR003956) was found among group I contigs.

The Massive Carotene Accumulation Precedes ABA Biosynthesis
During mesocarp development, there is a spectacular increase in α- and β-carotene contents (expressed as total mesocarp carotenoid per fruit) during the maturation and ripening phases IV and V (Fig. 1E). While prior to 120 DAP only minimal amounts of the xanthophylls lutein and violaxanthin were detected, the total carotene content (α- and β-carotenes) reached very high amounts (718 µg g⁻¹ dry mass) by 160 DAP. Only trace to low quantities of ABA were observed from 80 to 140 DAP, whereas a 6-fold increase in the ABA content was observed between 140 and 160 DAP (approximately 300 ng g⁻¹ dry mass). The amount of the conjugated storage form ABA Glc ester also increased 3-fold between 140 and 160 DAP (Fig. 3).
As with FA and TAG synthesis pathways, the 454 sequencing data set enabled the identification of at least one contig for almost every step of the carotene, xanthophyll, and ABA biosynthetic pathways (Fig. 3; Supplemental Table S2). However, only three contigs involved in carotenogenesis were found within group I. These three genes, PSY, PHYTOENE DESATURASE (PDS), and PLASTID TERMINAL OXIDASE (PTOX), all encode enzymes involved in the first two committed steps of carotenogenesis: condensation of two molecules of geranylgeranyl diphosphate (PSY), desaturation of the resulting phytoene (PDS), and recycling of the plastoquinol pool used as an electron acceptor for phytoene desaturation (PTOX). Moreover, these three contigs displayed maximal expression levels at 140 to 160 DAP, during which α- and β-carotene accumulated massively (Figs. 1E and 3).
Therefore, these findings suggest that the first two committed steps of the carotenoid pathway, accomplished by PSY and PDS, are the major transcriptional control points for mesocarp carotenoid metabolism. Furthermore, despite the fact that the other carotenogenic genes were found only within group II, many displayed similar expression profiles. For example, two contigs similar to 15-CIS-ζ-CAROTENE ISOMERASE and ζ-CAROTENE DESATURASE, also involved in the desaturation steps, and fibrillins that are associated with carotenoid storage, all have higher read amounts at 160 DAP. Overall, these results suggest a coordinated transcriptional regulation of carotenogenesis in the oil palm mesocarp.
By contrast, genes involved in the upstream pathway, which is common with the biosynthesis of other isoprenoids, appeared to be expressed at lower levels, although there are putative paralogs for both DXS and FARNESYL PYROPHOSPHATE SYNTHASE expressed during the carotenogenic phase. Similarly, genes acting downstream of the desaturation steps, in particular those encoding LCY-E and LCY-B, which direct lycopene toward aor b-carotene, were expressed rather weakly. However, considering that lycopene was not detected and that the mesocarp accumulates exceptional amounts of carotenes, LCY-E and LCY-B were obviously not rate-limiting steps. Therefore, as described previously for the TAG assembly pathway, this work provides evidence that a given metabolic pathway may be highly active without requiring that all biosynthetic steps display high and regulated transcript abundance.

Chloroplasts and Chromoplasts Share a Common Repertoire of Carotenogenesis Regulators
In contrast to our extensive knowledge of the enzymatic steps required for carotenoid synthesis in plants, relatively little is known about the regulatory mechanisms of this pathway (Lu and Li, 2008). Carotenoid content in the chromoplast has been related to the transcriptional control of genes involved in light signaling (PHY-A; UV-DAMAGED DNA-BINDING PROTEIN1 [DDB1]) and plastid morphogenesis (Or; Liu et al., 2004; Lu et al., 2006; Cazzonelli and Pogson, 2010), while for chloroplasts, genes involved in ethylene signaling (RAP2.2), chromatin modification (SDG8), and light signaling (DEETIOLATED1 [DET1] and PHYTOCHROME-INTERACTING FACTOR1 [PIF1]) were reported (Mustilli et al., 1999; Welsch et al., 2007; Cazzonelli et al., 2009; Toledo-Ortiz et al., 2010). In the oil palm mesocarp, contigs similar to regulators from both types of plastids were found. For example, DDB1 and a histone methyltransferase with a SET domain and Or and DET1, for chloroplasts and chromoplasts, respectively, were found to have peak transcript amounts at 140 DAP, which presumably coincides with the chloroplast-to-chromoplast transition, when carotenoids accumulate (Fig. 3). Similarly, contigs encoding type VII ethylene response factor (ERF)/RAP2.2 TFs, which may mediate PSY and PDS expression (Welsch et al., 2007), were found to be up-regulated at 140 DAP (Fig. 4B). Finally, a contig similar to the gene encoding a PIF, shown to down-regulate the accumulation of carotenoids by specifically repressing PSY
To determine the involvement of ethylene in the function, maturation, and ripening of the oil palm mesocarp tissue, we first analyzed ethylene produced in the mesocarp during selected stages of development (Fig. 4A; Supplemental Fig. S1). The lowest amounts of ethylene were detected in the 120-DAP mesocarp (207 nL g⁻¹ fresh weight h⁻¹), while the maximum amounts were detected at 160 DAP (7,263 nL g⁻¹ fresh weight h⁻¹), which represents a 35-fold increase in ethylene production.
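As a quick check of the arithmetic behind the reported 35-fold increase, the ratio of the two ethylene production rates quoted above can be computed directly. This is only an illustrative sketch; the function and variable names are ours and are not part of the study.

```python
def fold_change(low: float, high: float) -> float:
    """Return the ratio high / low for two positive measurements."""
    if low <= 0:
        raise ValueError("the reference value must be positive")
    return high / low


# Ethylene production rates quoted in the text (nL per g fresh weight per h).
ethylene_120_dap = 207.0    # lowest amount, 120 DAP
ethylene_160_dap = 7263.0   # maximum amount, 160 DAP

ethylene_fold = fold_change(ethylene_120_dap, ethylene_160_dap)
print(f"Ethylene increase from 120 to 160 DAP: {ethylene_fold:.1f}-fold")
```

The ratio is slightly above 35, in agreement with the rounded value given in the text.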
Within group I, we found contigs with similarity to transcripts that encode the three key enzymes involved in ethylene biosynthesis (Fig. 4A). We also identified transcripts that encoded proteins similar to 1-aminocyclopropane-1-carboxylic acid oxidase (ACO), the most prevalent of which (CL1Contig999) was similar to the Arabidopsis ACO4 and peaked at 140 DAP, which coincides with the large increase in ethylene production in the mesocarp. Notably, some ACO transcripts were more abundant at 100 DAP, while other distinct transcripts strongly increased at 140 DAP. There were also three contigs within group I with similarity to Asp aminotransferase-like (ATT), all (in particular CL1Contig505) with peak transcript abundance at 140 DAP. ACC synthase (ACS) enzymes are part of the ATT family, and based on their expression profiles, these contigs are good candidates to participate in ethylene biosynthesis in the mesocarp. We also identified two contigs with low read amounts similar to Arabidopsis ACSs. These data support the hypothesis that ethylene production in the mesocarp is controlled through a similar process as found in tomato, namely an autoinhibitory system 1 ethylene production during development, followed by a transition to a climacteric-like autocatalytic system 2 production during ripening through the transcriptional activation of specific transcripts for ACS and ACO (Nakatsuka et al., 1998; Barry et al., 2000; Yokotani et al., 2009).
To further examine the activity of ethylene-related processes in the mesocarp, we searched for contigs with similarity to transcripts that encode ethylene receptors, signal transduction factors, and ERFs. Notably, we identified contigs with similarity to key factors within group I, with the exception of ETHYLENE INSENSITIVE2 (EIN2) and CONSTITUTIVE TRIPLE RESPONSE1, which were within group II (Fig. 4B). As observed with the transcripts for ethylene biosynthesis enzymes, the transition from system 1 to system 2 ethylene production is marked by a coordinated increase in a large number of ethylene-related transcripts, in particular those encoding ERFs. Furthermore, certain components of ethylene perception and signal transduction have transcript profiles that coincide with either system 1 or 2. For example, two contigs similar to the ethylene receptors ETHYLENE RESPONSE1 and EIN4 had peak read amounts at 100 DAP, while a contig similar to the key transcriptional activator EIN3 peaked at 140 DAP. The most striking observation is the number of ERF-like transcripts with expression profile peaks at 140 DAP. Phylogenetic analysis with Arabidopsis AP2 domains distinguished the 10 major groups (Nakano et al., 2006); however, no oil palm fruit sequences were assigned to groups VI and VIII (Supplemental Fig. S6). These results indicated that the most predominant ERF found in the oil palm mesocarp was the type VII, the majority of which have peak transcript amounts at 140 or 160 DAP, during mesocarp maturation and ripening (Fig. 4B; Supplemental Fig. S6). In contrast, only a single ERF/AP2 type VII (CL1Contig256) and one type IX (CL1Contig6009) had peak transcript amounts at 100 DAP. At both 120 and 140 DAP, type I and IV ERF/AP2s were most prevalent, while a single type X was identified at 140 DAP.
Overall, a large number of ethylene-related transcripts have expression profiles associated with either a system 1 basal amount of ethylene production or system 2 ethylene production during ripening. In particular, type VII ERF/AP2s represent a major transcriptional response associated with the ethylene burst measured during the maturation and ripening phases IV and V of the mesocarp.

A Diversity of MADS Box TFs Expressed in the Oil Palm Mesocarp at Key Stages
Our pyrosequencing-based approach allowed the identification of 14 sequences that contained at minimum the MADS box domain and were retained for further phylogenetic analysis (Fig. 5; Supplemental Fig. S7). The 14 oil palm MADS box sequences belong to seven different subfamilies defined by the genes from Arabidopsis APETALA1 (AP1), PISTILLATA (PI), AP3, AGAMOUS (AG), SEPALLATA (SEP), and AGAMOUS-LIKE6 (AGL6; Becker and Theissen, 2003) in addition to the tomato TM3 subfamily (Supplemental Table S2).
The class of MADS most represented in the mesocarp belongs to the AGL2/SEP subfamily, including three within group I and two within group II (Fig. 5A). Notably, there were no oil palm sequences found in the SEP subclades 1 and 5, which contain either the strawberry FaMADS9 or the tomato LeRIN, respectively, both of which perform key functions during fruit development and ripening (Vrebalov et al., 2002; Seymour et al., 2011). Overall, the five contigs within the AGL2/SEP subfamily are grouped within two distinct subclades supported by significant bootstrap values. Within the SEP3 subclade, EgAGL2-1, reported in oil palm male and female inflorescences, CL1Contig8010/EgAGL2-2 (100% amino acid identity), and CL1Contig6409 form a monophyletic group, while CL1Contig3848 forms a monophyletic group with the banana fruit MaMADS2 and MaMADS4 (Elitzur et al., 2010). Contig CL1Contig8010 is the most abundant MADS found in the mesocarp, with an expression profile coordinated during maturation and ripening (Fig. 5A). CL1Contig2367 and CL1Contig1295 are found within another subclade that contains uniquely monocot sequences. Notably, CL1Contig2367 peaks at 100 DAP and decreases during maturation and ripening, suggesting a function prior to phase III of mesocarp development. Furthermore, the oil palm sequences systematically form subclades with other monocot sequences and do not group with those from dicot fruit species, which suggests that a diversification of function within this subfamily has occurred within monocots since the separation from eudicots.
There were three contigs identified within the MADS box AG subfamily. CL1Contig1226 is the most abundant and peaks at 140 DAP (Fig. 5B). The three oil palm AG-like contigs analyzed constitute a monophyletic group with the banana fruit MaMADS5 (Elitzur et al., 2010; Fig. 5B). The mesocarp fruit AG-like sequences are distinct from EgAG1 and EgAG2 reported in oil palm male and female inflorescences that group together in the eudicot AG-like subclade. The cladogram indicates seven monophyletic groups supported by significant bootstrap values. The PLENA and eudicot AG lineages include TAGL1 and TOMATO AGAMOUS (TAG1), respectively, which have regulatory functions during tomato flower and fruit development, respectively (Seymour et al., 2008; Pan et al., 2010). Notably, these two groups contain sequences exclusively from eudicot fruit species such as peach, apple (Malus domestica), and grape.
From the GLOBOSA/PISTILLATA (GLO/PI) subfamily, a single contig from group I, CL290Contig1, shares 100% nucleic acid identity with EgGLO1 from oil palm identified previously (Supplemental Fig. S7; Adam et al., 2006). The oil palm GLO-like protein is separated from the fleshy fruit eudicot sequences and positions closest to the banana MaMADS6, the only available sequence from a monocot fleshy fruit species in this subfamily (Elitzur et al., 2010). The accumulation of CL290Contig1 is significantly increased during maturation and remains high during ripening.
Figure 5. Phylogenetic analysis of oil palm MADS with other MADS genes. A, Phylogenetic analysis of CL1Contig3848, CL1Contig8010, CL1Contig6409, CL1Contig2367, and CL1295Contig1 with other SEP clade protein sequences using the neighbor-joining method based on the multiple alignment of the partial protein MIK sequences (139 residues). B, Phylogenetic analysis of CL1Contig5719, CL1Contig1226, and CL1Contig5512 with other AG clade protein sequences using the neighbor-joining method based on multiple alignment of the full-length MIKC protein sequences. ANR1 from Arabidopsis was used as the root. Numbers on the branches are bootstrap values for 100 replicates. The sequences included in this alignment are from Arabidopsis, maize, rice, oil palm, and snapdragon (Antirrhinum majus) and from proteins in selected fruit species such as apple, banana, grape, peach, strawberry, and tomato. The MADS box proteins from our study are in gray. Relative expression is as in Figure 2.

Profiles of TFs and Transcription Regulators in the Mesocarp Suggest That Coordinated Transcriptional Mechanisms Occur in the Mesocarp
A total of 1,930 group I contigs (73%) were assigned INTERPRO accession annotations, including 150 contigs with functions related to transcriptional regulation, among which 127 contigs were TFs and transcription regulators (TRs) not previously described in this study (Supplemental Tables S3-S6). Overall, there are 37 TF/TR contigs with peak read amounts within cluster A, 14 contigs at 120 DAP within cluster B, 85 contigs at 140 DAP within cluster C, and 14 contigs at 160 DAP within cluster D. As observed earlier from the GO annotation enrichment analysis (Supplemental Table S1), the largest proportion of contigs related to transcriptional regulation were found at 140 DAP within cluster C.
Within cluster A, with peak transcription amounts at 100 DAP, the most abundant TF/TR is a Cys-2/His-2-type zinc finger protein found within subcluster A3 (Supplemental Table S3). Notably, there are two NAC domain proteins that are also abundant within the same subcluster A3 (Supplemental Fig. S4). Indeed, NAC domain proteins are the most represented class of proteins at 100 DAP, three within subcluster A1 and two within subcluster A3, suggesting coordinated regulation of this class of TF/TR. AUXIN RESPONSE FACTORs (ARFs) are the second most represented class, with four ARFs all within subcluster A1, suggesting strong transcriptional coordination among this class of TF/TR. Another notable feature of cluster A is the presence of contigs for genes encoding chromatin modifiers, including a condensin complex component, a DNA methyltransferase, a polycomb group protein, and two contigs for jumonji domain TFs implicated in chromatin regulation during development (Takeuchi et al., 2006). These results indicate that several factors involved in chromatin modifications that can affect changes in gene expression are present and may function at 100 DAP.
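The cluster counts given above can be tallied to confirm that they sum to the 150 transcription-related contigs and to see how strongly cluster C dominates. The sketch below uses only numbers stated in the text; the dictionary layout and variable names are illustrative. It also assumes that group I corresponds to the 2,629 differentially represented contigs mentioned in the abstract, which is our reading rather than something stated in this section.

```python
# TF/TR contigs by cluster and peak stage, as listed in the text.
tf_tr_peak_counts = {
    "A (100 DAP)": 37,
    "B (120 DAP)": 14,
    "C (140 DAP)": 85,
    "D (160 DAP)": 14,
}

total_tf_tr = sum(tf_tr_peak_counts.values())
largest_cluster = max(tf_tr_peak_counts, key=tf_tr_peak_counts.get)
cluster_shares = {
    name: count / total_tf_tr for name, count in tf_tr_peak_counts.items()
}

# Assumption: group I = the 2,629 differentially represented contigs.
group_i_contigs = 2629
annotated_contigs = 1930
annotated_fraction = annotated_contigs / group_i_contigs

print(f"Total TF/TR-related contigs: {total_tf_tr}")
for name, share in cluster_shares.items():
    print(f"  cluster {name}: {share:.1%}")
print(f"Largest cluster: {largest_cluster}")
print(f"INTERPRO-annotated fraction of group I: {annotated_fraction:.0%}")
```

Under that assumption the annotated fraction rounds to the 73% quoted in the text, and cluster C alone holds more than half of the transcription-related contigs.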
Remarkably, only 14 TF/TR contigs are found within cluster B, which suggests that restricted transcriptome regulation occurs at this stage in the mesocarp (Supplemental Table S4). In addition to EgAP2.2, the three ERF/AP2 TFs, and the MADS box described in the previous sections (Figs. 4B and 5), there are two calmodulin-related signaling proteins, one IAA/AUX protein, one zinc finger protein similar to the maize Indeterminate1, one Golden2-like protein that regulates chloroplast development and the expression of the photosynthetic apparatus, one bZIP, and a SEUSS-LIKE3 TR.
In cluster C, in addition to the type VII ERF/AP2 TFs and the MADS box TFs described above, two contigs for bZIP TF were found in subclusters C1 and C4 (Supplemental Table S5). Other classes that are well represented in cluster C include five NAC domain proteins, four zinc finger proteins, three BELL1-LIKE HOMEODOMAIN proteins, and three WRKY TF proteins. Finally, two contigs (CL1Contig6719 and CL2179Contig1) with similarity (E . 1e-30) to the homeodomain-Leu zipper protein LeHB-1, which binds to the promoter of LeACO1 to activate transcription during ripening (Lin et al., 2008), were identified.
In cluster D, 11 out of 14 contigs are found within subgroup D3 (Supplemental Fig. S4), which suggests that tight transcriptional coordination occurs at the ripening stage (Supplemental Table S6). In addition to the AGL2/SEP described above (Fig. 5A), other prevalent contigs found within subcluster D3 included a basic helix-loop-helix protein, a COP9 signalosome complex subunit 3, and a NAC domain protein. Indeed, NAC domain proteins made up the most represented class in cluster D; notably, two of the NAC domain proteins had the highest similarity (E . 1e-50) to the tomato NAC-NOR, reported as a component of the ethylene-dependent ripening regulation in tomato (Giovannoni, 2004).
Finally, no contigs for CNR or SlAP2a, regulatory factors that also directly or indirectly affect ethylene production during tomato ripening (Thompson et al., 1999;Chung et al., 2010), were found within group I.

The Oil Palm Mesocarp Is an Original Fruit Model to Examine Regulatory Mechanisms That Function during Fruit Maturation and Ripening
The oil palm mesocarp presents an original model to examine the regulatory networks in a monocot fruit tissue subject to climacteric ripening and in which high amounts of lipids accumulate. This study provides, to our knowledge for the first time, a detailed description of the original physiological and biochemical characteristics along with a thorough analysis of the underlying transcriptional activities, combined to develop a model that describes major events that occur during the defined phases of mesocarp development (Fig. 6).
As in tomato development, a decrease in auxins, GAs, and cytokinins (Gillaspy et al., 1993) is also observed prior to mesocarp ripening; however, a second auxin peak that overlaps with that of ethylene during tomato ripening is not observed in the oil palm mesocarp. ABA is also present early during tomato development, and a recent detailed study in tomato revealed that an ABA peak occurs just prior to that of ethylene during ripening (Zhang et al., 2009a). By contrast, in the oil palm mesocarp, there is a large simultaneous increase in both ethylene and ABA during fruit ripening that suggests regulatory functions and/or interactions for these two hormones in the mesocarp.
Transcriptional Regulation in a Lipid-Rich Fleshy Fruit Plant Physiol. Vol. 156, 2011 575
In the tomato fruit, the increase in ethylene production at the onset of ripening coincides with a transition from autoinhibitory (system 1) to autostimulatory (system 2) ethylene production. The tomato model predicts that system 1 functions prior to ripening while system 2 operates during ripening and that the two systems are regulated by coordinated developmental and ethylene-induced expression of individual ACO, ACS, and tomato ethylene receptor genes (Nakatsuka et al., 1998; Barry et al., 2000; Yokotani et al., 2009). In this study, an increase in ethylene production characteristic of a climacteric burst is observed during phases IV and V. Furthermore, the coordinated transcriptional activation of subsets of transcripts for ACO and ACS-like genes (Asp aminotransferase-like genes) suggests that a transition from system 1 to system 2 occurs in the oil palm mesocarp during the maturation phase IV (Fig. 6). Indeed, by 140 DAP, a sharp increase in transcripts for ethylene biosynthetic enzymes and ERFs in the mesocarp is observed that resembles system 2-type ethylene production. Notably, six ERFs with peak read totals at 140 or 160 DAP are type VII ERFs similar to those implicated in the ripening of tomato, kiwi (Actinidia deliciosa), and plum (Prunus domestica; Tournier et al., 2003; Wang et al., 2007; El-Sharkawy et al., 2009; Sharma et al., 2010; Yin et al., 2010). Phylogenetic analysis with the oil palm sequences and selected type VII ERF signature domains (Nakano et al., 2006) from Arabidopsis suggests conservation between monocots and dicots and that gene duplication events may have occurred for multiple type VII genes to be expressed in the mesocarp (Fig. 4; Supplemental Fig. S6). A total of 13 NAC domain-containing proteins within clusters A, C, and D indicate that this class of TFs is one of the most highly represented in the mesocarp.
Indeed, NAC domain proteins are key participants in TF networks and play central roles in plant development (Olsen et al., 2005). While NAC domain TFs were the most prominent class at 100 DAP, four ARFs were also observed that coincide with the IAA metabolite peak, suggesting auxin-related functions at this stage (Fig. 6). Furthermore, two other NAC domain contigs were identified with similarity to the tomato NAC-NOR (AAU43922.1) gene, the mutation in which has been reported to be responsible for the nor ripening mutant (Giovannoni, 2004). Notably, both of these NAC domain contigs have expression profiles associated with mesocarp ripening, which suggests functional similarity to the tomato NAC-NOR gene. While these results suggest conserved functions for some NAC domain proteins during ripening, very little is known about their roles during fruit development. In oil palm, the large number of transcripts observed for this class of TFs with different expression profiles that correspond to key phases of mesocarp development may reflect a high number of gene duplication events followed by expression domain changes that occurred in this family since the separation between monocots and eudicots.

The Expression of MADS Box Genes from the SEP, AG, and GLO Subfamilies Characterizes Key Phases of Mesocarp Development and Function
In this study, we found a diversity of MADS box transcripts with expression associated with the different phases of mesocarp development. MADS box proteins are key regulators of fruit ripening (Vrebalov et al., 2002, 2009; Ito et al., 2008; Itkin et al., 2009; Giménez et al., 2010; Jaakola et al., 2010; Seymour et al., 2011). In particular, the protein RIN within the SEP4 subclade controls the induction of the tomato ACS transcripts involved in system 2 ethylene production (Barry et al., 2000). Recently, a SEP1/2 subclade MADS box from the nonclimacteric fruit strawberry (FaMADS9) was shown to play a key role in ripening (Seymour et al., 2011). To date, very little is known about the SEP subgroup involvement in monocot fruit ripening. Three SEP3 homologs, MaMADS1, -2, and -4, expressed in ripening banana fruit tissues share low amino acid sequence similarity (55%-62%) with RIN, and none thus far complements the tomato rin mutation (Elitzur et al., 2010). However, MaMADS2 is expressed in the pulp prior to the burst of ethylene production and is not induced by ethylene treatments. In the oil palm mesocarp, there were at least five SEP homologs expressed, two of which (CL1Contig8010 and CL1Contig3848) within the SEP3 subclade increase during the ethylene burst. Interestingly, phylogenetic analysis indicated that CL1Contig3848 grouped closely to the banana MaMADS2, while neither of these homologs grouped closely with the tomato RIN or with the strawberry FaMADS9. These results support the hypothesis that the MADS box SEP-like proteins are key factors of ripening not only in climacteric and nonclimacteric fruits but also in monocot and dicot fruits. Interestingly, FaMADS9 and RIN belong to different SEP subgroups, SEP1/2 and SEP4, respectively, and the strawberry sequence most similar to RIN (FaSEP4) is expressed at very low levels in the fruit.
These results demonstrate that while SEP-like proteins indeed operate in the ripening process of both climacteric and nonclimacteric fruits, sequence divergence has occurred that may reflect differences between the two ripening models. While the downstream targets of FaMADS9 are unknown, the evidence indicates that some common functions may exist during both climacteric and nonclimacteric ripening for SEP subgroup MADS box genes in these two dicots. Our data support the hypothesis that in monocot fruits, SEP3 homologs function upstream of ripening processes analogous to the tomato RIN and the strawberry FaMADS9, and the divergence observed within the SEP subgroup may reflect important differences between fruit ripening systems, including those of monocots.
Another tomato MADS box protein, TAGL1 within the AG subgroup, also functions upstream of ethylene production, apparently through the control of LeACS2 expression yet independently from RIN (Itkin et al., 2009; Vrebalov et al., 2009; Giménez et al., 2010). Very little is known about the role of AG clade MADS box proteins during nonclimacteric or monocot fruit ripening. In grape, there are at least three AG-like transcripts expressed in the fruit but unassociated with ripening (Boss et al., 2001, 2002; Díaz-Riquelme et al., 2009). In banana, there appears to be at least one AG-like transcript (MaMADS5/MuMADS1) expressed in the pulp that is induced by ethylene (Liu et al., 2009; Elitzur et al., 2010). In the oil palm mesocarp, we identified at least three transcripts that encode AG clade MADS box proteins. Notably, one transcript (CL1Contig5512) accumulates with no relation to the ethylene burst, while another (CL1Contig1226) accumulates during the initial burst of ethylene production and related transcriptional activity. The domains encoded by these two sequences, along with that of a third oil palm sequence and the banana MaMADS5, form a monocot monophyletic subclade. Furthermore, this monocot subclade is distinct from those containing the dicot tomato fruit TAGL1 and TAG1 or the oil palm EgAG1 and -2 expressed in the oil palm female flower (Adam et al., 2007a). These results suggest that gene duplications in the oil palm genome have occurred that gave rise to at least three distinct yet closely related AG-like sequences with different expression profiles during mesocarp development, maturation, and ripening, while EgAG1 and -2 expression and function may be limited to flower development. However, it is not known whether the three AG-like sequences expressed in the mesocarp are also expressed during flower development or in other oil palm tissues.
Together, these results support the hypothesis that some AG clade MADS box proteins have diverged to gain expanded expression with new functions in the fruit beyond those performed during flower formation. Furthermore, as with the SEP clade MADS box proteins, a divergence of AG-like proteins between dicots and monocots is clearly observed that may reflect underlying functional differences between these two plant groups in relation to fruit development and ripening.
A transcript (EgGLO1) for a GLO/PI family MADS box was also observed with expression related to the ethylene burst. To our knowledge, EgGLO1 is the first within this class to be observed with a fruit ripening-related expression profile in any species. Indeed, while GLO members have been identified in eudicot fleshy fruit species such as grape, apple, and peach and in the monocot banana fruit, thus far none is associated with fruit ripening (Yao et al., 2001; Busi et al., 2003; Hileman et al., 2006; Poupin et al., 2007; Zhang et al., 2008; Díaz-Riquelme et al., 2009; Elitzur et al., 2010). In contrast, the oil palm mesocarp EgGLO1 was expressed in relation to ethylene production and related transcriptional activity during maturation and ripening. Notably, the EgGLO1 nucleotide sequence is identical to the transcript observed previously during flower development. Indeed, there were two GLO/PI family MADS box transcripts (EgGLO1 and EgGLO2) found to be expressed during flower development, but only EgGLO1 was observed in the mesocarp. Therefore, either the same gene that functions during flower development is also active and plays a role during fruit ripening or a duplication has occurred and another similar gene has diverged to acquire a new fruit-related function.
Overall, these data indicate the involvement of MADS box genes, in particular SEP-like, AG-like, and GLO-like, during maturation and ripening of the oil palm mesocarp. There appears to have been a substantial amount of functional diversification in the SEP-like and AG-like subfamilies, given the number of new transcripts found compared with those observed previously in studies of the oil palm flower structure (Adam et al., 2007a, 2007b). This diversification appears to have arisen through the expansion of the expression of genes that either have retained a function in the flower or have specialized functions within the fruit. Indeed, several sequences identified in this study, including those from the SEP-like and the GLO-like subfamilies, were identical at the nucleotide level to those observed in the oil palm flower and may be the same gene. Finally, the expression of certain MADS box transcripts either positively or negatively coincides with the burst of ethylene production that occurs by 140 and 160 DAP. This raises the question of the relationship between ethylene production and related transcriptional activity and whether these MADS box gene products play regulatory roles either upstream or downstream of the ethylene burst during mesocarp ripening.

The Last Acylating Step of TAG Assembly May Involve Several Complementary Routes
One of the most notable findings from the reconstruction of the oil synthesis pathways in the oil palm mesocarp is the possible contribution of three enzymes for the acylation of DAG to TAG (i.e. both type 1 and type 2 DGAT and PDAT). Although the literature shows that the three enzymes, encoded by three separate gene families, are all capable of catalyzing the final acylation step, their individual contributions in plant tissues that accumulate oil remain poorly understood. The similar transcript amounts detected in this study suggest that the contribution of these three enzymes could be of similar importance in the mesocarp of oil palm. Based on transcription patterns of DGAT1, DGAT2, and PDAT in developing seeds of various plants, it was suggested that DGAT2 and PDAT could be major contributors of TAG synthesis in seeds that store large amounts of epoxy and hydroxy FA (Li et al., 2010a). However, a recent study also indicates high transcription of both DGAT1 and DGAT2 in the mesocarp of olive during the stage of oil synthesis, while olive oil does not contain any unusual FA (Banilas et al., 2011). In that study, the authors hypothesize that DGAT2 could play a specific role during oil droplet enlargement, which occurs at late stages of olive mesocarp maturation. Furthermore, it was recently demonstrated that both DGAT1 and PDAT1 contribute to TAG synthesis in the developing Arabidopsis embryo (Zhang et al., 2009b). A major role for PDAT in the developing oil palm mesocarp is substantiated by the identification of a contig very similar to its necessary partner, lysophospholipid acyltransferase, whose transcript levels were comparable to those of other transcripts involved in TAG assembly.

Divergence of the FA Synthesis and TAG Assembly Transcriptional Regulation
The second major result drawn from the analysis of the lipid-related transcriptome is that the core FA synthetic machinery of the oil palm mesocarp is remarkably coordinated at the transcriptional level (Figs. 2 and 6). Large-scale transcriptomic studies focused on oil synthesis in plants remain scarce. However, since the same key result was identified in the developing embryo of Arabidopsis (Ruuska et al., 2002) and the persistent living endosperm of albuminous coffee seeds (Joët et al., 2009), it is tempting to speculate that this feature is common to plant oil-storing tissues, independent of their origin. In all three cases, a sharp increase in plastidial gene transcription occurs at the onset of lipid accumulation. Similar to that observed in the Arabidopsis embryo and the coffee endosperm, ER genes involved in TAG assembly displayed an expression profile different from that of FA synthesis genes, suggesting distinct regulation programs. However, while the fold increases of most genes involved in TAG assembly were higher than those of the core FA biosynthetic machinery in the Arabidopsis embryo, transcript levels observed in the oil palm mesocarp for genes responsible for TAG assembly remained low and did not show a clear increase concomitant with oil deposition, such as that observed for FA synthesis genes. Since no other tissue accumulates more oil than the oil palm mesocarp, this work clearly indicates that a marked up-regulation of TAG assembly gene transcription is not strictly required for oil accumulation and that the regulation of this pathway may differ considerably among plants and tissues. Based on our transcriptomic data here, one can hypothesize that oil synthesis in the oil palm mesocarp is mostly controlled by de novo FA synthesis.
This hypothesis is consistent with previous studies (Ramli et al., 2002, 2009) that quantified flux control coefficients of the two blocks, FA synthesis and TAG assembly, in callus cultures of olive and oil palm using the technique of metabolic control analysis.
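For readers unfamiliar with metabolic control analysis, the block flux control coefficients mentioned above can be written in their standard textbook form. The notation below is ours (J for the steady-state flux to TAG, v for the activity of each block) and is not taken from the cited studies; no numerical values are implied.

```latex
% Standard definitions from metabolic control analysis (symbols are assumptions):
% J       : steady-state flux of lipid synthesis
% v_FA    : activity of the FA synthesis block
% v_TAG   : activity of the TAG assembly block
C^{J}_{\mathrm{FA}} = \frac{\partial \ln J}{\partial \ln v_{\mathrm{FA}}},
\qquad
C^{J}_{\mathrm{TAG}} = \frac{\partial \ln J}{\partial \ln v_{\mathrm{TAG}}},
\qquad
C^{J}_{\mathrm{FA}} + C^{J}_{\mathrm{TAG}} = 1
```

In this two-block description, the hypothesis that oil synthesis is mostly controlled by de novo FA synthesis corresponds to the FA synthesis block carrying the larger share of control, that is, C^J_FA > C^J_TAG.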

Phytoene Synthase and Phytoene Desaturase Are the Key Players for Carotenoid Accumulation
Our results clearly revealed a coordinated transcriptional activation of the carotenogenic genes involved in the synthesis of phytoene and subsequent desaturation steps leading to lycopene, while further steps involved in the cyclization of lycopene toward carotenes and upstream isoprenoid synthesis appear to be poorly regulated or not regulated at the transcriptional level. Despite the fact that the oil palm mesocarp stores large amounts of carotenes but not lycopene, the transcriptional regulation of carotenogenesis nevertheless resembles that of the lycopene-rich tomato fruit (i.e. through the transcriptional activation of the first two committed steps of carotenoid synthesis; Ronen et al., 1999). This important finding suggests that the control of carotenoid metabolism in oil palm may first operate by directing the metabolic flux into the carotenoid pathway. This is consistent with previous reports that showed that PSY expression was the rate-limiting step for carotenogenesis in various plant models (Fraser et al., 1994;Shewmaker et al., 1999). In addition, the constitutive overexpression of PSY was found to be sufficient to trigger the accumulation and sequestration of carotenes into crystals in plastids from nongreen tissues of Arabidopsis (Maass et al., 2009). Considering the importance of PSY and PDS up-regulation for carotenogenesis in the oil palm mesocarp and that class VII ERF/RAP2.2 TFs are known to mediate PSY and PDS expression (Welsch et al., 2007), we can speculate that the contigs encoding ERF/RAP2.2 found to be upregulated at 140 DAP participate in the massive carotene accumulation in this fruit. Finally, the concomitant up-regulation of the Or, DDB1, and DET1 regulators suggests that carotenoid accumulation may also be controlled by enhancing sink strength (i.e. the formation of carotenoid sequestration structures during the chloroplast-to-chromoplast transition). 
Indeed, the Or gene was previously shown to cause high levels of β-carotene accumulation in an orange cauliflower (Brassica oleracea) mutant by enhancing sink strength (Lu et al., 2006). In addition, expression of the Or transgene in potato (Solanum tuberosum) produces deep orange-yellow flesh tubers with a more than 6-fold increase in total carotenoid content. This increase in the Or transgenic tubers was found to be associated with the formation of chromoplasts and specific carotenoid sequestration structures, which serve as an effective metabolic sink to facilitate the sequestration and storage of carotenoids (Lopez et al., 2008).
The low abundance of lycopene cyclase transcripts (LCY) was at first quite puzzling, since the oil palm mesocarp does not accumulate lycopene but carotenes. However, this discrepancy can be interpreted in two ways. First, it clearly evokes the low level of transcription of the whole TAG assembly pathway, while the oil palm mesocarp is exceptionally efficient regarding TAG synthesis. One of the major lessons drawn from the combined analysis of both oil and carotenoid transcriptomes is that the coordinated up-regulation of all steps of a given pathway is not a prerequisite for the activity of the corresponding metabolism. Besides, the amount of carotenoids synthesized may also depend on the properties of the encoded enzymes and therefore may rely on the polymorphism observed at the sequence level. Indeed, a recent study carried out in cassava (Manihot esculenta) reported that a single nucleotide polymorphism in PSY2, leading to a nonconservative amino acid exchange, results in a dramatic increase of carotenoid accumulation in storage roots (Welsch et al., 2010). This polymorphism occurs in a region that is usually highly conserved among plants (the ALDRWE amino acid sequence; Supplemental Fig. S8), and the A191D exchange leads to an increased PSY enzymatic activity in cassava. Notably, this conserved region is also affected in oil palm, with an amino acid exchange in the vicinity of the Arg (the L198M exchange; Supplemental Fig. S8). This discovery opens up interesting prospects, and a more in-depth characterization of the EgPSY enzyme should allow us to test whether the exceptional carotene content of the oil palm mesocarp is also influenced by the peculiar amino acid sequence polymorphism in this region.

Toward an Understanding of FA Synthesis Regulation in the Oil Palm Mesocarp
The fact that most genes involved in the core FA biosynthetic machinery share the same temporal transcription pattern strongly suggests that they are coregulated and probably share common regulatory elements. On the basis of recent literature on the control of oil biosynthesis in seeds (Baud and Lepiniec, 2010), we hypothesize that an AP2 TF similar to WRI1 could play this key role in the oil palm mesocarp (Fig. 6). Indeed, we not only found a highly expressed WRI1-like contig, named EgAP2-2, that was perfectly coregulated with FA biosynthetic genes but also identified the same AW box sequence in several genes belonging to the core FA biosynthetic machinery, which confirms that this cis-element, considered as the direct binding site for WRI1 (Maeo et al., 2009), is highly conserved among monocots and eudicots (Pouvreau et al., 2011). To our knowledge, our work here reveals for the first time that the transcriptional regulation of FA biosynthesis via WRI1 may not be restricted to seeds but may also operate in other tissues that accumulate lipids, such as the oil palm mesocarp.
These results raise the critical question of whether EgAP2-2 expression in the mesocarp is regulated by master regulators analogous to those that function during seed maturation (Baud and Lepiniec, 2010). The absence of any LEC2-like transcript in the oil palm mesocarp transcriptome is not surprising, since LEC2 presumably exists only in dicot genomes (Li et al., 2010b). In contrast, since LEC1 was recently proposed to regulate WRI1 in maize (Shen et al., 2010), it constituted a viable candidate for EgAP2-2 regulation. However, given that LEC1 is often considered specific to seed tissues, the regulation of EgAP2-2 in the mesocarp may be completely different. Nonetheless, since LEC1 was recently shown to be expressed in fern (Adiantum capillus-veneris) leaves in response to dehydration (Xie et al., 2008), its participation in oil synthesis regulation in the mesocarp of a monocot species could not be excluded. However, we did not identify any contig showing high homology with LEC1-type HAP3, suggesting that a different regulatory cascade controls WRI1 expression in the oil palm mesocarp.

Plant Material Production, Histology, and RNA Preparation for 454 Sequencing
Oil palm (Elaeis guineensis) fruits were harvested at Pobe Centre de Recherche Agricoles Plantes Pérennes Station (Institut National de Recherche Agricole du Bénin) from a dura parent of Deli Dabou origin, within the same self-progeny of a single palm. In this progeny, after two successive self-pollinations, the percentage of heterozygosity is estimated at only 20% to 25% (Cochard et al., 2009). For each stage of development studied, three independent bunches were collected from three distinct individuals of the same genotype. Fifteen spikelets were then collected in the center of each bunch, and five spikelets were randomly sampled from them. From all undamaged fruits withdrawn from each of the five spikelets, three fruits were then randomly sampled. Then, for each stage-bunch-spikelet combination retained, mesocarp, shell, and seed of each fruit were separated and weighed. Dry weight from all samples was measured after 1 night at 103°C. Variation in tissue weight across development was tested using three-way ANOVA and a posthoc Newman-Keuls test.
For histological analysis, samples were fixed in 4% paraformaldehyde and 1× phosphate-buffered saline (PBS) in the presence of 4% 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide for 16 h. Thereafter, samples were washed twice for 15 min in 1× PBS and Gly, then twice for 15 min in 1× PBS. Finally, samples were dehydrated in ethanol 50% for 1 h and 70°C before embedding in Technovit 7100 resin (Heraeus Kulzer). Semithin sections of mesocarp (4 μm) were stained with toluidine blue, which stains acid compounds such as pectin in blue in an acid environment. To visualize lipids in the mesocarp, fixed samples were sectioned (100 μm) using a vibratome, washed with 1× PBS, stained with Nile red, and analyzed by confocal microscopy (Zeiss LM510), with laser excitation at 458 nm and a HFT filter (548/514) using the 40×/1.2 numerical aperture. To visualize β-carotenes in mesocarp cells, fruit were sectioned (150 μm) after 16 h in fixation buffer using a vibratome, stored in 1× PBS, and then observed by confocal microscopy (Zeiss LM510 meta) with laser excitation at 488 nm and emission at 580 to 600 nm, corresponding to β-carotene emission spectra using the 63×/1.4 numerical aperture.
For RNA extractions, mesocarp tissue samples were collected and frozen immediately in liquid nitrogen. Total RNA from mesocarp was extracted as described previously. Using a titanium kit (Roche), cDNAs related to four stages of development (100, 120, 140, and 160 DAP) were tagged independently and then mixed together in one sample for 454 pyrosequencing carried out by GATC Biotech.

Hormonal Analysis
For hormonal profiling analysis, mesocarp was collected from fruit aged between 80 and 160 DAP, lyophilized, and then ground before being sent (3 × 50 mg of dry powder) to National Research Council Canada-Conseil National de Recherches Canada. ABA and ABA metabolites, cytokinins, auxins, and GAs were quantified by ultra-performance liquid chromatography-ESI-tandem mass spectrometry as described in detail by Chiwocha et al. (2003). To measure ethylene release, spikelets consisting of 10 fruits were placed in hermetic jars, and ethylene production was quantified after 24 h by gas chromatography (flame ionization detector, equipped with a GS-Q column). Values shown are means of duplicate determinations.

Lipid and Carotenoid Determination Analyses
Total lipids were extracted from 1-g samples of freeze-dried powder using a modified Folch method as described previously (Laffargue et al., 2007).
Carotenoid extraction was adapted from the method described previously (Achir et al., 2010), which includes a preliminary TAG removal step. Briefly, oil samples (100 mg) were carefully mixed with 2 mL of acetone containing 15 mg L⁻¹ echinenone as an internal standard and left for 30 min at -20°C, leading to TAG crystallization. TAGs were separated by rapid sampling of the upper acetone phase, which contains carotenoids, and filtered through a 0.45-μm polytetrafluoroethylene filter (Whatman). The carotenoid extract was directly injected into the HPLC column. The column was a polymeric YMC-30 (4.6 mm i.d. × 250 mm, 5 μm particle size; YMC). Elution was performed using five successive steps at a flow rate of 1 mL min⁻¹: (1) 100% A for 15 min; (2) linear gradient from 0% B to 20% B for 10 min; (3) linear gradient from 20% B to 90% B for 10 min; (4) linear gradient from 90% B to 100% B for 5 min; and (5) 100% B for 5 min (A = methanol:water [60:40, v/v] and B = methyl tert-butyl ether:methanol:water [68:28:4, v/v/v]). A UV-visible photodiode array detector (Dionex UVD 340U) was used to analyze the chromatograms at a detection wavelength of 450 nm, except for phytofluene (348 nm), phytoene (286 nm), and ζ-carotene (401 nm). Carotenoids were identified by the combined use of their relative retention times and their wavelength absorption maxima as determined using analytical standards (Carotenature). Quantification was done by combining the standard calibration curves and correction with internal standard peak areas. All analyses were performed in triplicate (from three independent mesocarp samples).
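The five-step elution program described above can be restated as a piecewise-linear function of run time. The following Python sketch is only an illustration of that schedule and is not part of the original analysis; the names GRADIENT_STEPS, TOTAL_RUN_MIN, and percent_b are hypothetical, and step (1) is assumed to be isocratic at 0% B (100% A).

```python
# Elution program from the text, one tuple per step:
# (duration in min, %B at the start of the step, %B at the end of the step).
GRADIENT_STEPS = [
    (15.0, 0.0, 0.0),      # (1) 100% A (assumed to mean 0% B throughout)
    (10.0, 0.0, 20.0),     # (2) linear gradient, 0% B to 20% B
    (10.0, 20.0, 90.0),    # (3) linear gradient, 20% B to 90% B
    (5.0, 90.0, 100.0),    # (4) linear gradient, 90% B to 100% B
    (5.0, 100.0, 100.0),   # (5) 100% B
]

# Total run time implied by the five steps.
TOTAL_RUN_MIN = sum(duration for duration, _, _ in GRADIENT_STEPS)


def percent_b(t_min):
    """Return the percentage of solvent B at t_min minutes into the run."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in GRADIENT_STEPS:
        if t_min <= elapsed + duration:
            # Linear interpolation within the current step.
            fraction = (t_min - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the elution program")


if __name__ == "__main__":
    for t in (0, 15, 20, 30, 40, 45):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```

Under these assumptions the program runs for 45 min in total, and the percentage of solvent A at any time is simply 100 minus the returned value.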

Sequence Analysis, de Novo Assembly, and Contig Annotation
The 454 reads were analyzed using a specialized version of our tool ESTtik (for EST Treatment and Investigation Kit; Argout et al., 2008) dedicated to the analysis of 454 data. To avoid problems of misassembly, we discarded sequences shorter than 120 bp and low-quality sequences. We then used the Megablast algorithm to detect and eliminate noncoding sequences by comparing the whole reads against fRNAdb (Mituyama et al., 2009) with an E value cutoff of 1e-20. To avoid chimeric consensus sequences, which contain parts of different expressed genes, we modified the parameters of the TGICL (Pertea et al., 2003) assembler to obtain good-quality contigs. We used a minimum overlap percentage identity cutoff of 94% and an overlap length cutoff of 60 bp for the clustering procedure of TGICL and a minimum overlap percentage identity cutoff of 95% and an overlap length cutoff of 60 bp for the assembly. Finally, we predicted peptide sequences from the transcriptomic data using a modified version of the prot4EST pipeline (Wasmuth and Blaxter, 2004). The peptide sequences were annotated using the InterProScan Web service (Zdobnov and Apweiler, 2001) to find protein domains and functional sites by comparing the sequences against the InterPro signature database (Hunter et al., 2009). Contigs were then annotated using the stand-alone version 2.2.16 of the BLAST software (Altschul et al., 1990). Similar sequences were searched within the TIGR6 proteome of rice (Oryza sativa japonica; Uniprot Knowledgebase Swiss-Prot and TrEMBL; Boeckmann et al., 2003), the nonredundant protein sequences database from GenBank, and the nucleotide sequences from the GenBank nucleotide collection NT. To obtain significant results for the whole BLAST analyses, we used an E value cutoff of 1e-5, and the 10 best hits of each BLAST were retained for the annotation.
In addition, we used the Blast2GO software (Götz et al., 2008) to retrieve GO terms based on our BLAST analyses.
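Two of the thresholds stated in this section, the 120-bp minimum read length and the retention of the 10 best BLAST hits at an E value cutoff of 1e-5, can be illustrated with a short Python sketch. This is not the ESTtik code; the function names, the tuple layout of a hit, and the choice to rank "best" hits by ascending E value are assumptions made for illustration, and the low-quality filter is omitted because its criterion is not given in the text.

```python
from collections import defaultdict

MIN_READ_LENGTH = 120   # reads shorter than this were discarded
E_VALUE_CUTOFF = 1e-5   # E value cutoff used for the annotation searches
MAX_HITS = 10           # number of best hits retained per search


def keep_read(sequence, min_length=MIN_READ_LENGTH):
    """Return True if a read passes the length filter."""
    return len(sequence) >= min_length


def select_hits(hits, cutoff=E_VALUE_CUTOFF, max_hits=MAX_HITS):
    """Keep, for each query contig, the best hits passing the cutoff.

    `hits` is an iterable of (query_id, subject_id, e_value) tuples
    (a hypothetical layout). Hits are ranked by ascending E value,
    which is an assumption about what "best" means here.
    """
    by_query = defaultdict(list)
    for query_id, subject_id, e_value in hits:
        if e_value <= cutoff:
            by_query[query_id].append((subject_id, e_value))
    return {
        query_id: sorted(found, key=lambda hit: hit[1])[:max_hits]
        for query_id, found in by_query.items()
    }


if __name__ == "__main__":
    example_hits = [
        ("contig1", "subjectA", 1e-30),
        ("contig1", "subjectB", 1e-3),   # fails the cutoff
        ("contig2", "subjectC", 1e-6),
    ]
    print(select_hits(example_hits))
```

A contig with no hit passing the cutoff is simply absent from the returned dictionary, which makes unannotated contigs easy to list separately.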

Statistical Analysis of Read Abundance Profiles
A contig was considered differentially expressed during development when at least one of the three stage transitions (100-120, 120-140, and 140-160 DAP) exhibited a highly significant difference in read abundance at P = 0.01. Differences in read abundance between two stages of development were tested using Audic-Claverie statistics (Audic and Claverie, 1997). P values obtained by the Audic-Claverie test were then transformed using the Bonferroni correction to control FDRs. Hierarchical clustering analysis was used to group contigs according to their transcription profile using the tool developed by Eisen et al. (1998; http://rana.lbl.gov/eisen/).
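To make the test concrete, the Audic and Claverie (1997) statistic gives the probability of observing y reads for a contig in one library, given x reads in another, with library sizes N1 and N2: p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The Python sketch below implements this formula together with a Bonferroni adjustment. It is an illustration rather than the code used in the study: the function names are hypothetical, and taking the smaller of the two cumulative tails as the P value is an assumed convention that the text does not specify.

```python
import math


def ac_pmf(y, x, n1, n2):
    """Audic-Claverie probability p(y | x) for library sizes n1 and n2."""
    q = n2 / n1
    log_p = (
        y * math.log(q)
        + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
        - (x + y + 1) * math.log1p(q)
    )
    return math.exp(log_p)


def ac_lower_tail(x, y, n1, n2):
    """P(Y <= y | x), summed directly."""
    return sum(ac_pmf(k, x, n1, n2) for k in range(y + 1))


def ac_upper_tail(x, y, n1, n2):
    """P(Y >= y | x), summed upward until the terms become negligible."""
    q = n2 / n1
    total = 0.0
    k = y
    while True:
        term = ac_pmf(k, x, n1, n2)
        total += term
        # Past the mode (about q * x) the terms only decrease.
        if k > q * x and term <= 1e-16 * total:
            return total
        k += 1


def ac_p_value(x, y, n1, n2):
    """Smaller of the two tails (an assumed convention, see lead-in)."""
    return min(1.0, ac_lower_tail(x, y, n1, n2), ac_upper_tail(x, y, n1, n2))


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical read counts for one contig at two stages.
    p = ac_p_value(x=5, y=40, n1=100000, n2=100000)
    print(p, p <= 0.01)
```

Note that the Bonferroni adjustment bounds the probability of any false positive across all tests, which is a more conservative criterion than a false discovery rate procedure.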

Phylogenetic Analysis
Phylogenetic trees were constructed based on similarity searches performed with BLASTp programs with default parameters in protein sequence databases provided by the National Center for Biotechnology Information server (http://www.ncbi.nlm.nih.gov). Phylogenetic analyses were performed on the Phylogeny.fr platform (http://www.phylogeny.fr; Dereeper et al., 2008). Amino acid sequences were aligned with ClustalW (version 2.0.3; Thompson et al., 1994). After alignment, ambiguous regions (i.e. containing gaps and/or poorly aligned) were removed with Gblocks (version 0.91b). The phylogenetic tree was reconstructed using the neighbor-joining method implemented in Neighbor from the PHYLIP package (version 3.66; Saitou and Nei, 1987). Distances were calculated using ProtDist. The Jones-Taylor-Thornton substitution model (Jones et al., 1992) was selected for the analysis. The robustness of the nodes was assessed by bootstrap proportion analysis (Felsenstein, 1985) computed from 100 replicates. Graphical representation and editing of the phylogenetic trees were performed with TreeDyn (version 198.3).
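The bootstrap step can be pictured as resampling alignment columns with replacement, rebuilding a tree from each pseudo-alignment, and counting how often each node recurs. The Python sketch below covers only the resampling part and is not the PHYLIP implementation; the function names, the dictionary layout of the alignment, and the fixed random seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    `alignment` maps sequence names to aligned sequences of equal length
    (a hypothetical layout). The same sampled columns are applied to every
    sequence so that homologous positions stay together.
    """
    lengths = {len(sequence) for sequence in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {
        name: "".join(sequence[c] for c in columns)
        for name, sequence in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return n_replicates pseudo-alignments (100 as in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Toy alignment; the sequences are illustrative only.
    toy = {"seqA": "MKVLAT", "seqB": "MKILAS", "seqC": "MRVLGT"}
    replicates = bootstrap_replicates(toy)
    print(len(replicates), replicates[0])
```

Each pseudo-alignment would then be passed through the same distance and neighbor-joining steps as the original alignment, and the bootstrap proportion of a node is the fraction of replicate trees in which it appears.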

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. Complete data set of auxin, GA and cytokinin, ABA, and ethylene metabolite analysis.
Supplemental Figure S3. FASTA list of assembled contigs.
Supplemental Figure S4. Hierarchical cluster analysis of group I contigs differentially represented.
Supplemental Figure S6. Phylogenetic analysis of the AP2 domains from the ERFs identified in the mesocarp.
Supplemental Figure S7. Phylogenetic analysis of the MADS domains from mesocarp MADS box proteins.
Supplemental Figure S8. Amino acid alignments of PSY conserved domains of oil palm contig (EgPSY) and others from cassava, Narcissus pseudonarcissus, rice, tomato, and maize.
Supplemental Table S1. GO enrichment analysis between developmental stages of group I contigs.
Supplemental Table S2. Annotations of contigs discussed in the text.
Supplemental Table S3. TFs and regulators expressed in the mesocarp cluster A.
Supplemental Table S4. TFs and regulators expressed in the mesocarp cluster B.
Supplemental Table S5. TFs and regulators expressed in the mesocarp cluster C.
Supplemental Table S6. TFs and regulators expressed in the mesocarp cluster D.