Genome-Based Metabolic Mapping and 13C Flux Analysis Reveal Systematic Properties of an Oleaginous Microalga Chlorella protothecoides1[OPEN]

Integrated flux balance analysis accurately reconstructs phototrophic and heterotrophic metabolism in Chlorella protothecoides. Integrated and genome-based flux balance analysis, metabolomics, and 13C-label profiling of phototrophic and heterotrophic metabolism in Chlorella protothecoides, an oleaginous green alga for biofuel. The green alga Chlorella protothecoides, capable of autotrophic and heterotrophic growth with rapid lipid synthesis, is a promising candidate for biofuel production. Based on the newly available genome knowledge of the alga, we reconstructed the compartmentalized metabolic network consisting of 272 metabolic reactions, 270 enzymes, and 461 encoding genes and simulated the growth in different cultivation conditions with flux balance analysis. Phenotype-phase plane analysis shows conditions achieving theoretical maximum of the biomass and corresponding fatty acid-producing rate for phototrophic cells (the ratio of photon uptake rate to CO2 uptake rate equals 8.4) and heterotrophic ones (the glucose uptake rate to O2 consumption rate reaches 2.4), respectively. Isotope-assisted liquid chromatography-mass spectrometry/mass spectrometry reveals higher metabolite concentrations in the glycolytic pathway and the tricarboxylic acid cycle in heterotrophic cells compared with autotrophic cells. We also observed enhanced levels of ATP, nicotinamide adenine dinucleotide (phosphate), reduced, acetyl-Coenzyme A, and malonyl-Coenzyme A in heterotrophic cells consistently, consistent with a strong activity of lipid synthesis. To profile the flux map in experimental conditions, we applied nonstationary 13C metabolic flux analysis as a complementing strategy to flux balance analysis. The result reveals negligible photorespiratory fluxes and a metabolically low active tricarboxylic acid cycle in phototrophic C. protothecoides. In comparison, high throughput of amphibolic reactions and the tricarboxylic acid cycle with no glyoxylate shunt activities were measured for heterotrophic cells. Taken together, the metabolic network modeling assisted by experimental metabolomics and 13C labeling better our understanding on global metabolism of oleaginous alga, paving the way to the systematic engineering of the microalga for biofuel production.

Green algae in the genus Chlorella spp. are a large group of eukaryotic, unicellular, and photosynthetic microorganisms that are widely distributed in freshwater environments. Chlorella spp. can grow photoautotrophically and were used as a model system in early research on photosynthetic CO 2 fixation (Bassham et al., 1950;Barker et al., 1956;Calvin, 1956). They are also among the very few algal groups capable of using organic carbon for heterotrophic growth, which endows Chlorella spp. metabolic flexibility in response to environmental perturbation. Because of its robust and various metabolic capacities, Chlorella spp. has aroused widespread interest as a potential alga for industrial production of biomass (Lee, 2001), biofuel (Xu et al., 2006), and value-added chemicals (Pulz and Gross, 2004).
Chlorella spp. are among the best oil feedstock microorganisms for the production of biofuel (Gouveia and Oliveira, 2009). It was frequently reported that, under nitrogen-limited environments, carbon overflow (e.g. nitrogen depletion or organic carbon feeding) allows Chlorella spp. to accumulate a high percentage of neutral lipids that can be processed for biodiesel production . Biogenesis of neutral lipids was found in Chlorella spp. cells undergoing glucose bleaching, in which the growth is switched from photoautotrophic to heterotrophic mode and accompanied by chlorophyll degradation. This metabolic transition has been incorporated into a highly efficient biofuel production process (Xiong et al., 2010a), whereas many global changes in metabolism, such as degeneration of the chloroplast, redistribution of carbon flux, and reprogrammed nitrogen assimilation, remain poorly understood.
In our previous studies, we initiated metabolic analysis of Chlorella spp. by steady-state 13 C analysis of proteinogenic amino acids and profiling of metabolic fluxes for heterotrophic Chlorella protothecoides (Xiong et al., 2010b). The quantitative metabolic information can be further expanded to the complete metabolome by means of recent development in isotope-associated mass spectroscopy (MS), which enables the accurate measurement of both isotopic labeling kinetics and intracellular metabolite concentrations Seifar et al., 2008). Intracellular fluxes can be estimated by computational modeling of dynamic isotopic labeling patterns (Young et al., 2011). This methodology allowed for mapping fluxes under autotrophic growth conditions (Young et al., 2011), because it overcame the limitation of traditional steady-state 13 C metabolic flux analysis (MFA), which fails to resolve the flux from a uniformly steady 13 C labeling pattern because of the assimilation of CO 2 as carbon source of autotrophic organisms.
The metabolic information may also be interfaced with genomic knowledge to generate a bird's eye view of the metabolic properties in system level. Metabolic pathways can be reassembled stoichiometrically and simulated by constraint-based flux balance analysis (FBA), which uses a linear programming approach (Lee et al., 2006) to accomplish the optimal solutions of the objective function (usually maximizing the yield of biomass or a specific metabolic product; Edwards et al., 2002;Oberhardt et al., 2009). The reconstructed metabolic models based on multidimensional genome annotation are of great value in guiding metabolic engineering and genetic improvement of photosynthetic microorganisms, such as Synechocystis sp. PCC 6803 (Shastri and Morgan, 2005;Fu, 2009;Knoop et al., 2010;Montagud et al., 2010;Nogales et al., 2012), Arthrospira platensis (Cogne et al., 2003;Klanchui et al., 2012), and Chlamydomonas reinhardtii (Boyle and Morgan, 2009;Chang et al., 2011;Dal'Molin et al., 2011).
The goal of our study is to systematically investigate C. protothecoides metabolism. Our research procedures are outlined in Figure 1, combining the computational and experimental knowledge by in silico reconstruction of the metabolic network. Specifically, based on functional annotation of the genome (Gao et al., 2014), we mapped primary metabolism of C. protothecoides and adopted FBA to simulate the optimal status for cell growth and lipid accumulation. To validate the FBA model, we quantitatively measured core metabolites involved in the metabolic model by liquid chromatography (LC) -MS. Kinetic isotopic tracing of intracellular metabolites was performed to quantify intracellular fluxes. Comparative analysis of 13 C MFA measurement and FBA prediction suggests distinct properties of C. protothecoides in autotrophic and heterotrophic metabolism. Altogether, the studies presented here will detail the Chlorella spp. metabolism and pave the way to biofuel production from this microalga with the predictive ability of a constructed metabolic model.

A Primary Metabolic Network of C. protothecoides
The starting point of metabolic mapping was to reconstruct the primary metabolic network based on the genome of C. protothecoides, which was sequenced recently (Gao et al., 2014). The genomic information contained in the database was reorganized into various essential pathways, such as the Calvin-Benson cycle/the pentose phosphate (PP) pathway, glycolysis/gluconeogenesis, the tricarboxylic acid cycle, biosynthetic pathways of macrobiomolecules (amino acids, nucleotides, UDP-Glc, glycerol-3-P, and fatty acids as well as chlorophyll), etc. (Fig. 2). Some catabolic pathways, such as b-oxidation of fatty acids, were also assembled. For each biochemical reaction, details are listed for gene number, gene ontology number, and enzyme name. The EC number and the Kyoto Encyclopedia of Genes and Genomes identifier for each enzyme are also provided so that the stoichiometry of the reactions can be subsequently looked up and extracted from the online database.
All reactions were considered to be localized into four main compartments: cytosol, chloroplast (Gao et al., 2014), mitochondria (Grant and Hommersand, 1974), and peroxisome (Codd et al., 1972;Chou et al., 2008), where most of the common metabolic reactions take place. Figure 1. Schematic MFA workflow of C. protothecoides using computational and experimental approaches. Briefly, we reconstructed the primary metabolic network of the oleaginous microalga C. protothecoides based on the recently available gene functional annotation (Gao et al., 2014). Intracellular fluxes under different cultivation conditions were simulated, and the optimal solutions were predicted using cell growth and lipid accumulation as the objective functions. The genome-based metabolic model was further constrained by experimental metabolomics and dynamic isotope labeling for INST-MFA (Young, 2014). Results from 13 C MFA measurement and FBA prediction are comparably analyzed.
Corresponding compounds, including gas, nutrients, and intracellular metabolites, were shuttled between compartments through transport systems or passive diffusion. The methodology for localization was described in "Materials and Methods." In the process of reassembling the genome-based information, in total, 270 functional enzymes coded by 461 metabolism-related genes catalyzing 272 biochemical reactions are relocated into their metabolic pathways, suggesting the completeness of primary metabolism in Chlorella spp. Absences of some reactions were identified.
To simulate cell growth and predict intracellular fluxes, biomass and macromolecular compositions were measured and are presented in Table I. Biomass was divided into protein, lipid, carbohydrate, DNA, and RNA, which totally account for 92% and 95% dry cell weight of autotrophic and heterotrophic cells, respectively (Fig. 3, A and B). The significant variations between the two types of cells are the predominant presence of protein in autotrophic C. protothecoides and the vast majority of lipid in heterotrophic biomass, indicating nutrition effect on cell storage. To detail lipid metabolism in our model, we also analyzed the fatty acid profile in C. protothecoides and found predominantly fatty acid chains of 18 carbons with one and two degrees of unsaturation in both autotrophic and heterotrophic cells.
Based on the newly accessible genome knowledge as well as the biomass composition information, we reconstructed the primary metabolic network and established a fundamental model focusing on the core metabolism of C. protothecoides. This is the first genome-based metabolic network of this alga, which could serve as an in silico platform to exploit the intracellular properties of C. protothecoides metabolism. The detailed process is described in "Materials and Methods," and the complete dataset regarding reaction network and gene-protein association of the enzymes and transporters involved in the network is presented in Supplemental Tables S2 and S3. The reconstruction data are also provided in SBML format as Supplemental Text S1 and S2.

Growth Simulation of C. protothecoides under Autotrophic and Heterotrophic Cultivation Conditions
On the basis of the reconstructed metabolic model, growth properties of C. protothecoides in different cultivation conditions were evaluated using FBA. We defined the flux of nutrients and metabolic end products in and out of cells under these circumstances as constraints to maximize biomass production as the objective function (flux balance analysis model 1 [FBA1]; described in "Materials and Methods"). The calculated specific growth rate was in good agreement with the experimentally determined one (Table II), validating the accurate determination of uptake/secretion flux in and out of cells and that no additional significant carbon assimilation reaction and secretion are missed during the cultivation. We also predicted the ideal growth rate of C. protothecoides in autotrophic and heterotrophic conditions with only the energy source specified (flux balance analysis model 2 [FBA2]; described in "Materials and Methods"). This model can explore the optimal carbon use. The result shows the potential of autotrophic and heterotrophic growth rates, which could be further improved 20% and 10%, respectively (Table II). According to the comparison of FBA1 and FBA2 (Supplemental Fig. S1), improved carbon metabolism can be realized by less carbon loss from the oxidative PP pathway and pyruvate dehydrogenase and increased carbon assimilation from phosphoenolpyruvate carboxylase as well. a Biomass components were comparable between two types of cells and applicable for additional calculation with the unmeasured dry weights less than 10% (8% and 5% of autotrophic and heterotrophic cells, respectively). b Nucleotide composition of DNA was obtained from genome sequencing, and the same composition was assumed for RNA.

Phenotype-Phase Plane Analysis Predicts Optimal Growth and Fatty Acid Synthesis under Autotrophic and Heterotrophic Conditions
Based on the reconstructed network model, we predicted the optimal growth rate and corresponding productivity of targeted fuel compounds (represented by the oleic acid, which is the most abundant fatty acid in C. protothecoides) using phenotype-phase plane (PhPP) analysis (Edwards et al., 2001) with the variation of input fluxes (light absorption and CO 2 uptake in autotrophic mode, and O 2 consumption and Glc uptake in heterotrophic mode). For the autotrophic model, the intensity of light input was represented by absorbed photon flux. The surface of a three-dimensional PhPP corresponding to the predicted maximal growth rate and fatty acid synthesis rate was plotted as a function of the uptake photon flux (0-3 mmol g dry cell weight 21 h 21 ) and the CO 2 uptake rate (0-0.3 mmol g dry cell weight 21 h 21 ; Fig. 4, A and B). It was simulated that the cells exhibited distinct phenotypes depending on the amounts of carbon fixation and light absorption. As shown, the rates for growth and fatty acid production are zero in Region I, where photon uptake is low. It is probable that light is insufficient to generate enough ATP for cell growth in this region. In Region II, the maximal cell growth is positively correlated with the light absorption; however, it is negatively correlated with the CO 2 uptake. Interestingly, inhibition of photosynthesis by elevated CO 2 was reported in other eukaryotic algae, such as C. reinhardtii (Yang and Gao, 2003) and Chlorococcum littorale (Satoh et al., 2001). In Region III, the growth rate is only limited by the CO 2 uptake because of light saturation. In economic considerations, the optimal growth and lipid accumulation appear on the demarcation line separating Regions II and III, with the ratio of photon uptake to CO 2 consumption rate around 8.4.
Similarly, prediction of the optimal growth rate and corresponding fatty acid synthesis rate in heterotrophic C. protothecoides was conducted with Glc and O 2 consumption rate (ranging from 0 to 0.9 mmol g dry cell weight 21 h 21 and from 0 to 1.2 mmol g dry cell weight 21 h 21 , respectively) as input variables (Fig. 4,C and D). The surface plot also shows three distinct regions. In Region I, the optimal growth rate is dependent exclusively on oxygen absorption, because oxidative phosphorylation could be repressed with low O 2 availability. This is comparable with the work by Chen et al. (2011), which showed that enhanced O 2 uptake could lead to higher growth rate of Escherichia coli. Cell growth and fatty acid biosynthesis are dependent on both variables protothecoides. Biomass components of major cellular macromolecules were expressed based on mass fraction of dry cell weight. Metabolite concentrations were expressed in moles per liter, and percentages of total concentrations are in parentheses. Amino acids include Glu, Arg, Ala, Asp, Gln, Asn, Lys, Pro, Thr, Val, Gly, Ser, His, Leu, Ile, Met, Tyr, Trp, and Phe. Nucleotides include ATP, ADP, and AMP. Central carbon intermediates are g-aminobutyric acid, (iso)citrate, malate, Glc/Fru-6-P, a-ketoglutarate, succinate, ribose-5-P, dihydroxyacetonphosphate, 6-phosphogluconate, phosphoenolpyruvate, fumarate, Hyp, and pyruvate. Redox cofactors are NAD + , NADH, NADP + , and NADPH, and coenzymes are acetyl-CoA and malonyl-CoA.
in Region II at the very beginning, after which the rates of cell growth and fatty acid synthesis cease with excess O 2 uptake. According to the PhPP model, one probable explanation is that excessive O 2 uptake leads to the weakening of the PP pathway and greatly enhanced activity of the tricarboxylic acid cycle, which oxidizes organic carbon to CO 2 . The situation continues in Region III, and algal cells fail to grow with low Glc availability and high O 2 absorption. Consequently, the optimal ratio of Glc uptake to O 2 consumption is 2.7 for heterotrophic growth and fatty acid synthesis, which could serve as a theoretical criterion for the cultural optimization of C. protothecoides in Glc-containing media.

Metabolite Concentrations in Autotrophic and Heterotrophic Cells
To validate the genome-based metabolic network of C. protothecoides experimentally, we took advantage of the LC-MS tools to quantitatively measure the network components: intracellular metabolites throughout the central carbon metabolism. To this end, we set up an LC-MS/MS program for metabolomics that coupled hydrophilic interaction and reversed-phase HPLC by electrospray ionization to triple-quadrupole MS/MS. The optimized MS parameters for each metabolite, including the limit of quantification, coefficient of determination (R 2 ), and relative SD, were listed in Supplemental Table S4. We also optimized the procedures for sample preparation, including cell harvesting, sample quenching, and solvent extracting. The extraction capacities of various solvent combinations were compared with a heat map analysis (Supplemental Fig. S2), and as a result, the methanol: water (50:50) mixture was shown to have major advantages for metabolite yields over other combinations.
To avoid the ion suppression caused by the matrix effect (Matuszewski et al., 1998;Annesley, 2003), which suppresses the signal of a compound in the sample relative to that of a standard at the same concentration and leads to the unreliability of determination, we used an isotope labeling approach proposed by Bennett et al. (2008) for metabolomic quantitation. Briefly, unlabeled standards in known concentrations were spiked into the extracts of cells that were completely labeled with isotopic substrate in the culture media. The amount of endogenous metabolite present in the cells was then determined by the ratio of endogenous metabolite to internal standard in the extract.
Identified metabolites with intracellular concentrations in both photoautotrophic and heterotrophic cells are summarized in Supplemental Table S5 with 95% confidence intervals. Of a total of 144 metabolites from the reconstructed draft metabolic network, 40 were identified and quantitated in the metabolome of C. protothecoides. The core metabolome, on a molar basis, consists of sugar phosphates (0.3% in autotrophic and 4.8% in heterotrophic mode), organic acids (7.7% in autotrophic and 11.3% in heterotrophic mode), amino acids (89.7% in autotrophic and 76.3% in heterotrophic mode), CoA (0.01% in autotrophic and 0.1% in heterotrophic mode), nucleotides (1.3% in autotrophic and 4.8% in heterotrophic mode), and reducing equivalents (0.9% in autotrophic and 2.4% in heterotrophic mode; Fig. 3, C and D). The most abundant metabolite in both nutritional patterns is Glu measured as 21.8 and 6.53 mM, respectively. Compared with metabolite concentrations between nutrient patterns, variations are present in the majority of metabolites (77.5% of which exhibit significant difference between autotrophic and heterotrophic cells; P , 0.05 by two-tailed Student's t test), especially for intermediates in glycolysis and the tricarboxylic acid cycle, in accordance with the substantial impact of nutrient mode on the metabolome composition.
Metabolome of C. protothecoides mirrors its metabolic features in response to nutrient changes. For example, with respect to the metabolites of reducing equivalents and bioenergy currency, the molar ratio of NADPH to its oxidative counterpart rises from 0.66 in CO 2 -fed cells to 1.75 in Glc-fed cells. The NADH-NAD + ratio also increases significantly from 0.003 in CO 2 -fed cells to 0.01 in cells fed with Glc. Also, intracellular concentration of ATP presents a 1.5-fold increase in heterotrophic growth, and the energy charge correspondingly increases from 0.9 to 0.92 according to ( . Together, these results are in good agreement with the notion that Glc-fed condition favors energyintensive metabolism, such as fatty acid biosynthesis (Hardie et al., 1999). This conclusion is further supported by altered levels of building blocks in the fatty acid pathway: 3-fold and as high as 97-fold increases are observed for acetyl-CoA and malonyl-CoA, respectively, in heterotrophic metabolome compared with the autotrophic one.

Flux Estimation Using Isotopically Nonstationary Metabolic Flux Analysis
To further validate the genome-based metabolic network in experimental conditions, we applied isotopically nonstationary metabolic flux analysis (INST-MFA; Young Table II. Experimentally determined and predicted specific growth rate based on growth parameters of autotrophic and heterotrophic C. protothecoides Determined data were represented as mean 6 SD. m, Specific cell growth rate.  Figure S3. For autotrophic cells, 13 C-bicarbonate uptake resulted in prompt labeling of metabolites involved in the Calvin-Benson cycle (i.e. 3-phosphoglycerate, erythrose-4-P, and Glc-6-P/Fru-6-P), whereas there was relatively slower turnover of the tricarboxylic acid cycle intermediates (i.e. citrate/isocitrate, succinate, fumarate, and malate), reflecting differentiated flux distribution among pathways. Note that the tricarboxylic acid cycle metabolites, such as (iso)citrate and succinate, were slowly labeled as above 50% of the unlabeled (M0) isotopomer that remained 60 min postlabeling. In contrast to this, sugar phosphates, such as 3-phosphoglycerate, have a short half-labeling time around 8 min. With respect to heterotrophic cells, feeding uniformly labeled 13 C-Glc to cells led to faster labeling of sugar phosphates (i.e. 3-phosphoglycerate, erythrose-4-P, and Glc-6-P/Fru-6-P), reflecting high activity of glycolytic and the PP pathway. . The predicted maximal growth rate (A) and corresponding fatty acid production rate (B) as a function of light intensity and total CO 2 uptake, respectively, under photoautotrophic growth. The predicted maximal growth rate (C) and fatty acid production rate (D) as a function of Glc and O 2 uptake, respectively, under heterotrophic growth. The fatty acid production rate is represented by that of oleic acid, which is the most abundant fatty acid in the C. protothecoides biomass. The surface is colored relative to the value of the z axis. gCDW, Gram cell dry weight.
In terms of metabolites in the tricarboxylic acid cycle, one interesting finding is the identical labeling pattern of malate and fumarate. Prompt appearance of the three carbon-labeled (M3) isotope was observed for both of them, indicating the strong activities of malic enzyme or phosphoenolpyruvate carboxylase, which convert labeled (phosphoenol)pyruvate and unlabeled CO 2 to organic acids and thus, shortcut the tricarboxylic acid cycle from the oxidative direction. Based on least squares regression, by which the residuals between experimentally determined and simulated isotopomer distributions are minimized, we generated a quantitative flux map of C. protothecoides as displayed in Figure 5. Net fluxes are normalized to a CO 2 fixation rate of 100 by Rubisco flux (0.154 mmol g dry cell weight 21 h 21 ) in autotrophic cells and Glc uptake rate of 100 in heterotrophic cells (Glc uptake rate: 0.303 mmol g dry cell weight 21 h 21 ). The metabolic activity of autotrophically growing cells is mainly derived from the chloroplast as a generator of energy and organic compounds. The energy needed for autotrophic metabolism mainly comes from the photophosphorylation in chloroplast, and glyceraldehyde-3-P, the product of photosynthesis, is transferred into the cytoplasm and serves as a precursor for the biosynthesis of biomass components (Fig. 5A). It is noticed that a much smaller flux through succinate was calculated compared with the average metabolic flow in the upstream and downstream of this node, suggesting a low respiratory activity for succinate dehydrogenase under illumination. This result is consistent with the notion that the major function of the tricarboxylic acid cycle in photosynthetic microbes is in biosynthesis instead of generating energy (Pearce et al., 1969;Steinhauser et al., 2012;Schwarz et al., 2013). According to the computation results, photorespiration, the oxygenation process of Rubisco (a bifunctional enzyme catalyzing the first step of CO 2 fixation in the Calvin-Benson cycle), functions in autotrophic growth but merely accounts for 0.1% of the total Rubisco activity (Fig. 5A). Although photorespiration flux is measured low in photosynthetic microbes, such as cyanobacteria (Young et al., 2011), it is usually thought to be indispensable as glyoxylate, the intermediate in this pathway that is required for Gly and Ser biosynthesis (Knoop et al., 2010). In our reconstruction, these two amino acids can be synthesized elsewhere  Meanwhile, high concentration of bicarbonate added for the 13 C labeling experiment could further suppress photorespiration (Huege et al., 2011;Young et al., 2011).
Flux map for heterotrophic cells was based on the Glc catabolism, because C. protothecoides grew upon U-13 C Glc as the sole carbon source. Most of the carbon flux for heterotrophic growth is directed through the glycolytic pathway (Fig. 5B). High activity of the C4 pathway is also observed. Consistent with the rapid accumulation of malate M3 isotopomer, malic enzyme channels 36% of the flux through phosphoenolpyruvate. The glyoxylate shunt (isocitrate lyase encoding gene Cpr004092.1 and malate synthase encoding gene Cpr001903.2) was included in our model while measured to be inactive in heterotrophic cells, which agrees with our previous study (Xiong et al., 2010a(Xiong et al., , 2010b. Another noteworthy result is the high use rate of acetyl-CoA for fatty acids in heterotrophically grown C. protothecoides, exhibiting 33-fold higher flux than that in autotrophic growth (Fig. 5B). Interestingly, acetyl-CoA inputs into the tricarboxylic acid cycle also enhance greatly. In heterotrophic conditions, the activity of the tricarboxylic acid cycle depending on acetyl-CoA oxidation arises to supply the energy and reducing equivalents required for biomass formation, including the biosynthesis of fatty acids. Because acetyl-CoA is also presumably used as the building block for fatty acid synthesis, an appropriate split ratio of metabolic flux should be assigned into these two pathways for the carbon and cofactor balancing. According to our flux map, the ratio of pyruvate used for fatty acid synthesis to that directed through the tricarboxylic acid cycle is assigned as 6.8:1 in heterotrophic Chlorella spp. and correspondingly, 3.4:1 in autotrophic cells, suggesting a flexible and tunable flux distribution for fatty acid synthesis. Feasible strategies to optimize this ratio may include inhibition of citrate synthase, the first enzyme in the tricarboxylic acid cycle. It was reported to cause the accumulation of acetyl-CoA, which could be donated for fatty acid synthesis (Taylor, 1973;Coleman and Bhattacharjee, 1975;Underwood et al., 2002). Meanwhile, alternative pathways generating the reducing equivalents are desired for fatty acid synthesis.

Cofactor Balance Analysis
Based on the quantitative results of 13 C MFA, the production and consumption of cofactors ATP and NAD(P)H can be analyzed in autotrophic and heterotrophic C. protothecoides. As shown in Table III, photosynthesis is principally responsible for the turnover of cofactors in autotrophic cells, because 89% ATP and 78.6% NADPH are directly derived from the light reactions and consumed for the generation of glyceraldehyde-3-P. The rest of cofactors are used in biomass synthesis for cell growth. For example, fatty acid synthesis consume 1.1% and 4.1% of the total ATP and NAD(P)H produced, respectively. In heterotrophic cells, glycolysis and tricarboxylic acid cycle are the main contributors for the formation of ATP. Note that maintenance accounts for a great percentage of ATP drain according to the quantitative model. Reducing equivalents are concomitantly produced during the formation of 3-phosphoglycerate and the decarboxylation catalyzed by pyruvate dehydrogenase as well as other reaction process in the tricarboxylic acid cycle. The abundantly formed NAD(P)H is mainly used for ATP generation through the respiratory electron transport chain and the formation of biomass components, especially the NADPH-intensive fatty acid synthesis [computationally 13.6% and 53.1% of the total ATP and NAD(P)H produced, respectively] in heterotrophic C. protothecoides.

DISCUSSION
Elucidation of metabolic properties and functions in a systemic level can be greatly facilitated by metabolic network modeling that integrates the genome-annotated enzymatic reactions and computational approaches to a functional entity. Most of the current reconstructed networks are assessed using FBA, which uses a linear programming approach and provides a metabolic flux distribution consistent with the optimal solution of the objective function. In this study, we reconstructed the primary metabolic network of C. protothecoides on the basis of genome information, focusing on central metabolism comprised by the Calvin-Benson cycle, glycolysis, the PP pathway, the tricarboxylic acid cycle, and the biosynthetic pathways of biomass building blocks. The model was operated with in silico calculations, which were in reasonable agreement with the measured growth rates. The application of FBA led to several conclusions: (1) completeness of metabolic functionality mined in the genome information (few gaps in the certain pathways were identified and filled, whereas only four putative enzymes missed), (2) achievement of suboptimal growth by autotrophic and heterotrophic cells, and (3) optimality of metabolic parameter for cell growth and fatty acid production. The objective function that we selected in the model, to maximize biomass formation and the inlet and outlet carbon flux through the autotrophic and heterotrophic C. protothecoides (e.g. photon uptake and CO 2 uptake rate of autotrophic cells and the CO 2 excretion and Glc uptake of heterotrophic cells), can be determined. Our modeling defined by carbon influx (FBA1) resulted in agreement of calculated cell growth rates with experimental determined ones, thus validating the accuracy of our approach. FBA was then applied multiple times by varying growth conditions, and the output of objective function formed the PhPP plot, which mirrors metabolic potential of C. protothecoides. Overall, this model represents the first metabolic network draft, to our knowledge, for oleaginous microalga C. protothecoides reassembled from the newest availability of genome information. Upon presented framework, higher-resolution metabolic mapping is expected through extensive refinery of genomic information.
Despite its wide application in metabolic network analysis, the limitation of FBA should be taken into account in that solution space needs to be further constrained to approach real flux values. To address this, we adopt two experimental strategies. (1) Validation of C. protothecoides metabolism by the metabolomic analysis. According to the absolutely quantitative measurement of metabolite concentrations using LC-MS, we compared the metabolite levels of C. protothecoides in different nutrients. Heterotrophic C. protothecoides exhibits high levels of ATP, NAD(P)H, acetyl-CoA, and malonyl-CoA, supporting a superior lipid synthesis over autotrophic cells and being in line with the FBA modeling results from the reconstructed network. (2) Application of FBA framework in nonstationary 13 C MFA. For this purpose, INST-MFA was, for the first time to our knowledge, used for a eukaryotic alga, and the best estimates of intracellular fluxes were obtained. To investigate the difference of real fluxomics and optimal flux status, we compared the results of INST-MFA estimation with FBA predictions. Shown in Supplemental Figure S1 are the net fluxes resulted from FBA and INST-MFA. Taking autotrophic cells for example, INST-MFA led to similar flux distributions as FBA models. However, the main differences include a less active tricarboxylic acid cycle detected by INST-MFA, consistent with slower labeling of the tricarboxylic acid intermediates. Note that, generally, the FBA model performs optimization of linear systems with seldom experimental inputs, whereas nonstationary 13 C MFA is further defined by ordinary differential equations aiming to fit kinetic 13 C data, thus leading to a more reliable flux map closer to reality. According to the results from FBA1 and 13 C MFA, differences of final flux values are in a few reactions in the PP pathway, photorespiration, glyoxylate shunt, and tricarboxylic acid cycle. Higher accuracy could be further realized if more dynamic isotope labeling trajectories of central carbon metabolites can be included in the system, and the genomebased FBA model, thus, offered an open framework for improved flux estimate. We also exploit this model by fitting it with the steady-state 13 C labeling data generated in our previous work (Xiong et al., 2010a(Xiong et al., , 2010b. A noncompartmented and simplified model was used to investigate fluxes of heterotrophic cells, because no detailed genomic information was available at that time. Applying the isotopomer distribution data of proteinogenic amino acids to this model generates higherresolution flux maps (data not shown). Compared with flux profiling defined by kinetic labeling data (Fig. 5B), flux distributions throughout primary pathways are fairly similar, indicating data consistency. Remaining minor discrepancies are probably caused by changed experimental conditions, different labeling strategies, and a refined flux-estimating algorithm. Nevertheless, it should be noted that our research provided a case study showing the feasibility of both stationary and nonstationary 13 C flux analyses based on a large-scale compartmentalized metabolic network. These integrated studies of metabolic network coupled with metabolomics analysis shed light on the metabolism of oleaginous microalga C. protothecoides and may serve as a cutting-edge toolbox for the systematic engineering of microalgae for biofuel production.

Abbreviations and Nomenclature
Abbreviations and nomenclature used in the figures and supplemental materials are listed in Supplemental Table S6.

Cultivation Conditions and Analytical Procedure
Microalga Chlorella protothecoides 0710 was provided by the Culture Collection of Alga at the University of Texas. The basic medium composition was the same as previously described (Xiong et al., 2008). Briefly, for autotrophic cultivation, 2 g L 21 NH 4 Cl was added to the basic medium as a nitrogen source with four 18-W cool-white fluorescent lamps (Philips) providing an average surface illumination intensity of 1,470 lux (namely, 19.9 mmol m 22 s 21 with the conversion factor of 0.0135); 10 g L 21 Glc and up to 1 g L 21 NH 4 Cl were used as carbon and nitrogen sources, respectively, for heterotrophic growth in a 1-L fermenter (Infors).
An Ultrospec 2000 UV/Visible Spectrophotometer (Pharmacia Biotech) was used to monitor alga growth by measurements of optical density at 540 nm (Xiong et al., 2010a(Xiong et al., , 2010b. Initial and residual Glc in the media during the cultivation process were determined by a BSA-40C enzymatic bioanalyzer (Shangdong Academy of Science). CO 2 consumed or secreted by autotrophic and heterotrophic algae was calculated from the difference between inlet and outlet CO 2 concentrations determined by a real-time tandem gas analyzer (Milligan Instrument).
To determine the macromolecular composition of C. protothecoides, the Lowry method (Holdsworth et al., 1988) was used to measure protein content, and amino acid composition was obtained with an L-8800 Amino Acid Analyzer (Hitachi); the content and composition of oil were measured by gas chromatography (GC)-MS as previously described (Xiong et al., 2010b). The 3,5-dinitrosalicylic acid method (Miller, 1959) was applied to determine intracellular carbohydrate and starch. The percentage of nucleic acid was determined according to the diphenylaminecolorimetric method (Gendimenico et al., 1988) and the method proposed by Benthin et al. (1991); chlorophyll in autotrophic algae was measured using the improved N,N-dimethylformamide extraction method according to Pan et al. (2001).

Primary Metabolic Network Reconstruction
The primary metabolic network is a foundational model focusing on the central metabolism of C. protothecoides. The reconstruction work began with the genome annotation of C. protothecoides, the genome that was recently sequenced by us (Gao et al., 2014). Textbook knowledge and constructed metabolic networks of other photosynthetic microbes, including Synechocystis sp. PCC 6803 and Chlamydomonas reinhardtii (Shastri and Morgan, 2005;Boyle and Morgan, 2009;Manichaikul et al., 2009;Knoop et al., 2010;Chang et al., 2011;Hädicke et al., 2011), were also referenced. An original list of reactions and corresponding enzymes was collated according to the pathway maps in the Kyoto Encyclopedia of Genes and Genomes (Kanehisa and Goto, 2000).
Each reaction in the network was manually checked, and those genes with hypothetical protein product were reannotated using the online blast program of UniProt (http://www.uniprot.org/blast/) and confirmed with conserved domain analysis (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Reaction gaps resulting from the incompleteness of genome annotation were identified. Missing genes were searched for in other species of Chlorophyta (C. reinhardtii, Ostreococcus tauri, and Volvox carteri ), Arabidopsis (Arabidopsis thaliana), or Saccharomyces cerevisiae, and the known nucleotide or amino acid sequences were blasted against the C. protothecoides genome database. Most of the missing enzymes had hits with acceptable e values, except for four essential enzymes, which were retained in the stoichiometric model for the completeness and functionality of the network. Reactions in a pathway without branch were lumped (e.g. routes in the biosynthesis of macromolecules) to simplify the model calculation, which was still sufficient to detail the network.
To model the biomass formation reactions, we determined the biomass components and macromolecular composition of autotrophic and heterotrophic C. protothecoides (Table I) consisting of proteins, lipids, carbohydrates, DNA and RNA, and chlorophyll in autotrophic algae. GC content of DNA was obtained from genome sequencing, and the same GC content was assumed for RNA. Lipids were represented by diacylglycerol, and fatty acids were subdivided to tetradecanoic, hexadecanoic, and octadecanoic acids as well as unsaturated fatty acids octadecenoic (C18:1) and octadecadienoic acid (C18:2). Genes related to the nonpolar lipid synthesis of phosphatidylethanolamine, phosphatidyl-Ser, phosphatidylcholine, phosphatidic acid, phosphatidylinositol, phosphatidylglycerol, sulfoquinovosyl diacylglycerol, monogalatosyl diacylglycerol, digalactosyl diacylglycerol, and ceramide were also identified, whereas the reactions were simplified in the flux computation. Because no reliable experimental data were available, ATP demand for maintenance (EC 2.7.4.1: Cpr003988.1) and transhydrogenase (EC 1.6.1.2: Cpr004180.1) was not specified to balance the energy generation and consumption of C. protothecoides in each scenario.
Enzymes present in the network were localized into four cellular compartments (cytoplasma, mitochondria with matrix and intermembrane space [Grant and Hommersand, 1974], chloroplast with stroma and thylakoid lumen [Gao et al., 2014], and peroxisome [Codd et al., 1972;Chou et al., 2008]) primarily based on literature evidence of C. protothecoides and enzyme homologs in other species of Chlorella spp. (Chlorella pyrenoidosa, Chlorella fusca, and Chlorella sorokiniana), C. reinhardti, and Arabidopsis supplemented by PredAlgo, a multisubcellular localization predictor for algae (Tardif et al., 2012). In the absence of any evidence for localization, enzymes were assigned according to neighboring reactions for the functional feasibility of the network.
The actual photon absorption flux of autotrophic C. protothecoides by lightharvesting complexes for metabolic use is difficult because much of the incident light was reflected or scattered before it could generate the light reaction. Here, we applied a simplified approach proposed by Manichaikul et al. (2009) to estimate the photon uptake flux. The conversion rate of incident light to photon absorption flux is assumed to be constant, which is also practicable for the photon saturation point. The minimal light uptake that is sufficient for photosynthetic saturation measured as O 2 evolution was calculated to be 37.72 mmol g dry cell weight 21 h 21 when autotrophic C. protothecoides grows at its maximal specific growth rate that has ever been reported (Sorokin and Krauss, 1959). The experimental photon flux saturation point was 500 mmol m 22 s 21 (Ouyang et al., 2010). Therefore, the actual flux of photon uptake can be estimated with experimentally measured light radiation.
The cyclic and noncyclic electron transport chains in photosynthesis and the ATP synthesis reaction were adopted according to Shastri and Morgan (2005). As for enzymes for which the electron donors and acceptors are unknown, NADH was assigned as the final donor, and NAD was assigned as the final acceptor (Knoop et al., 2010). NAD(H) and NADP(H) were integrated in one reaction for which both were available cofactors, and the stoichiometric numbers were set equal to minimize the consequence on FBA results.

Growth Simulation
FBA was applied in the quantification of the stoichiometric model of metabolic network using INCA 1.1 (Young, 2014) and CellNetAnalyzer 9.9 (Klamt et al., 2007) on MATLAB 7.6 (Mathworks). To simulate the growth state of C. protothecoides in autotrophic and heterotrophic conditions, we fixed photon uptake (set to zero for heterotrophic metabolism), Glc uptake (set to zero for autotrophic metabolism), and CO 2 uptake/excretion rates as constraints and maximized the growth rate as the objective function under the assumption that algal cells were striving to achieve a maximal biomass production, an assumption that is reasonable for cells in the exponential growth phase (FBA1). Other constraints used for simulations are reported in Supplemental Table S7. We also made the prediction with only energy source uptake flux (photons of the autotrophic model and Glc of the heterotrophic model) constrained to predict the ideal growth of C. protothecoides (FBA2), which allowed for the optimized carbon use and distribution within cells. Other constraints were the same as those of FBA1.

Metabolite Extraction
Extraction of metabolites in C. protothecoides was according to Bennett et al. (2008) with slight modification. A 2-mL cell culture was rapidly harvested by filtering with a 47-mm-diameter round hydrophilic nylon filter (Sartorius Nylon Membrane, 0.45 mm) and quickly transferred into a prechilled dish containing 5 mL of 270°C methanol for metabolism quenching. The dishes were stored in a 270°C freezer for 2 h; then, the filters were rinsed, and the cell suspension was centrifuged. The pellets were extracted with 500 mL of methanol:water (50:50) mixture three times (various extract solvent combinations were tested, and finally, a methanol:water [50:50] mixture was selected). The supernatant was collected and combined with the remaining methanol, and it was evaporated by centrifugation under a vacuum. The resulting pellets were redissolved in 600 mL of LC mobile phase (see below) and stored at 280°C until LC-MS analysis.
All standards and reagents for LC-MS analysis were purchased from Sigma-Aldrich and Acros Organics with purity $ 95%. Chromatographic peaks from samples were identified through comparison of retention times and mass spectra with those of standards. The lowest concentration allowed for quantification has a signal to noise ratio of 10. For comparing the extract capacity with different solvent combinations, clustering was carried out with freely available software Cluster 3.0 after normalization of the means of peak areas of the replicates against data of the 6-h sample and visualized with a heat map using TreeView 1.60 (Eisen et al., 1998).

Absolute Quantitation of Intracellular Metabolite Concentrations
Heterotrophic C. protothecoides was inoculated into basic medium with [U-13 C 6 ]Glc (99% isotopic purity; Cambridge Isotope Laboratories), and the handling was repeated two times before the 13 C-labeled cells were transferred into medium containing 1 g L 21 [U-13 C 6 ]Glc and 0.1 g L 21 NH 4 Cl. The cell cultures in early log phase were harvested as previously described with and without unlabeled standards spiked, and the quantitation was performed using the ratio of the 13 C peak height to the 12 C peak height obtained from MS data . Because of the triple-quadrupole MS scan time, monitoring every isotopic form in 13 C-fed cultures was not feasible, and therefore, equivalent unlabeled cultivations were analyzed to evaluate the completeness of labeling. The absolute intracellular metabolite concentrations were calculated according to the following formula: where R avg is the geometric mean of the ratio of the fully labeled peak intensity to that of the fully unlabeled peak (including spiked unlabeled standards) from three replicates, L avg is the geometric mean of the ratio of the fully labeled peak in cells from labeled media to the fully unlabeled peak in cells from unlabeled media, Z avg is the geometric mean of the ratio of fully unlabeled to fully labeled from cells fed with 13 C Glc and extracted without spiking any standard, M is the amount of standards added to cell extract, and F h is the total cell volume. For determination of the absolute concentrations in autotrophic algae, an equivalent of metabolite extract from green cells was mixed with that from the 13 C-labeled heterotrophic culture, and the concentrations were calculated by: where R ' avg is the geometric mean of the ratio of the fully labeled peak intensity to that of the fully unlabeled peak (including spiked unlabeled compounds from autotrophic algae), and F a is the corresponding autotrophic cell volume. SE was considered by error propagation, and the final intracellular metabolite concentrations with 95% confidence intervals were calculated and reported.

C Labeling Experiment
An elementary metabolite unit-based (Antoniewicz et al., 2007) INST-MFA was used to estimate intracellular metabolic fluxes. It uses the isotope labeling kinetics of intracellular metabolites (Young et al., 2008). Exponentially growing cells were then collected by centrifugation and resuspended in basic media with no carbon or nitrogen source for 5 min. Time 0 sampling was conducted using the harvesting and extracting method previously described. The broth was transferred into a 500-mL SEBC bottle (Fisher Scientific) with BOLA Multiple Distributor in a GL 45 cap (BOLA). The bottle was sterilized with a one-way valve fitted to the cap to exhaust residual gas. The valve was then sealed to prevent unlabeled CO 2 from entering the bottle. An 8-mL aliquot of 0.5 M NaH 13 CO 3 (99% isotopic purity; Cambridge Isotope Laboratories) was injected into the bottle through a rubber septum in the cap. pH was maintain at around 7.5 by the addition of 1 M H 2 SO 4 . The broth was sampled at time points of 30 s, 90 s, 5 min, 10 min, 30 min, 60 min, and 120 min using a syringe. C-switching of the heterotrophic cells was achieved similarly by substituting uniformly 13 C-labeled Glc with unlabeled carbon source, and the broth was sampled at the same time points.

MFA Based on Dynamic Isotopomer Distributions
After samples were analyzed by LC-MS, the isotopomers patterns of metabolites were monitored simultaneously (Supplemental Fig. S3). The isotopomer distributions, however, reflected mixed extracting pools of the same metabolites located in different organelles because of the compartmentalization of Chlorella spp. cells. To address this, pseudo-fluxes were introduced to simulate the pool mixing, which participates in the isotopomer balance but has no effect on the mass balance. The pseudo-fluxes were also used to account for the pool mixing of Glc-6-P and Fru-6-P, dihydroxyacetonphosphate and glyceraldehyde-3-P, and citrate and isocitrate, which were coquantitated. Dilution effect caused by the mixing of slowly labeled or metabolically inactive pools and the turnover of macromolecules was also considered with pseudo-fluxes on the principle of Isotopomer Spectral Analysis (Kelleher and Masterson, 1992). To reduce computation load and speed up the computation time, a condensed metabolic network of C. protothecoides was acquired by lumping the formation reactions of macromolecular building blocks with biomass produced directly from intermediates of primary metabolism (Supplemental Table S2). Then, the INST-MFA was applied, and the sum of squared residuals between the kinetics of isotope labeling of measured metabolites and the computationally simulated isotopomer distributions was minimized to obtain the best estimates of metabolic fluxes (other constraints were the same with FBA models) followed by calculation of SEs for all estimated fluxes (Young et al., 2011). Calculation was repeated 10 times using INCA 1.1 starting from random initial guess in each scenario to obtain the global optimum. All flux estimates and SEs are presented in Supplemental Table S8.
To do the cofactor balance analysis, we summarized all of the concomitant formation and consumption of ATP and NAD(P)H (NADH and NADPH were combined because of the uncertainty as cofactors of dehydrogenases and reductases) according to the INST-MFA results and categorized them into relevant pathways as shown in Table III. Sequence data from this article can be found in the GenBank/EMBL data libraries under accession number APJO00000000.

Supplemental Data
The following supplemental materials are available.
Supplemental Figure S1. Heat map of the calculated net fluxes in comparison of 13 C MFA, FBA1, and FBA2.
Supplemental Figure S2. Heat map of metabolite quantification reflecting the extract capacity with different solvent combinations.
Supplemental Figure S3. Kinetics of isotope labeling of measured metabolites.
Supplemental Table S1. Unannotated enzymes confirmed using a protein query tblastn search.
Supplemental Table S2. List of reactions, catalyzing enzymes, and localization in the primary metabolic network of C. protothecoides.
Supplemental Table S3. Gene-protein association of the enzymes and transporters involved in the network.
Supplemental Table S4. MS parameters and method performance for metabolites.
Supplemental Table S5. Absolute intracellular metabolite concentrations of C. protothecoides in a 95% confidence interval.
Supplemental Table S6. Abbreviations and nomenclature of metabolites.
Supplemental Table S7. Modeling constraints used in growth simulations with FBA.
Supplemental Table S8. Net fluxes with SEs determined by INST-MFA.
Supplemental Text S1. Autotrophic model of the C. protothecoides metabolic network presented in SBML format.
Supplemental Text S2. Heterotrophic model of the C. protothecoides metabolic network presented in SBML format.