Determination of residues responsible for substrate and product specificity of Solanum habrochaites short-chain cis-prenyltransferases

One Sentence Summary: The relative positions of aromatic amino acids and adjacent residues within domain II of short-chain cis-prenyltransferases contribute to the evolution of volatile terpene biosynthesis in Solanum trichomes.

ABSTRACT
Isoprenoids are diverse compounds that have their biosynthetic origin in the initial condensation of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to form C10 prenyl diphosphates that can be elongated by the addition of subsequent IPP units. These reactions are catalyzed by either cis-prenyltransferases (CPTs) or trans-prenyltransferases (TPTs). The synthesis of volatile terpenes in plants typically proceeds through either geranyl diphosphate (C10) or trans-farnesyl diphosphate (C15), to yield monoterpenes and sesquiterpenes, respectively. However, terpene biosynthesis in glandular trichomes of tomato (Solanum lycopersicum) and its wild relatives also occurs via neryl diphosphate (NPP) and Z,Z-farnesyl diphosphate (Z,Z-FPP). NPP and Z,Z-FPP are synthesized by neryl diphosphate synthase 1 (NDPS1) and Z,Z-farnesyl diphosphate synthase (zFPS), which are encoded by the orthologous CPT1 locus in tomato and Solanum habrochaites, respectively. In this study, comparative sequence analysis of NDPS1 and zFPS enzymes from S. habrochaites accessions that synthesize either monoterpenes or sesquiterpenes was performed to identify amino acid residues that correlate with the ability to synthesize NPP or Z,Z-FPP. Subsequent structural modeling, coupled with site-directed mutagenesis, highlighted the importance of four amino acids located within conserved domain II of CPT enzymes, which form part of the second alpha helix, for determining substrate and product specificity of these enzymes. In particular, the relative positioning of aromatic amino acid residues at positions 100 and 107 determines the ability of these enzymes to synthesize NPP or Z,Z-FPP.
This study provides insight into the biochemical evolution of terpene biosynthesis in the glandular trichomes

accessions of an NDPS1 enzyme rather than zFPS. In the present study, cDNAs corresponding to the CPT1 locus were isolated from the trichomes of chemically diverse S. habrochaites accessions, and comparative sequence analysis, together with homology modeling, was employed to identify specific amino acid residues that correlate with either NDPS1 or zFPS activity. The role of these residues in determining substrate and product specificity of short-chain CPTs was confirmed through site-directed mutagenesis, revealing an essential role for the relative positions of aromatic amino acids within a hydrophobic cleft between

80 µM [14C]IPP at 150 µM NPP and 10-100 µM NPP at 50 µM [14C]IPP; NDPS1-M4, 10-100 µM DMAPP at 50 µM [14C]IPP, 20-80 µM [14C]IPP at 150 µM NPP and 10-100 µM NPP at 50 µM [14C]IPP; zFPS-LA1393, 10-100 µM DMAPP at 50 µM [14C]IPP, 20-80 µM [14C]IPP at 150 µM NPP and 10-100 µM NPP at 50 µM [14C]IPP; zFPS-M10, 20-80 µM [14C]IPP at 150 µM DMAPP and 10-100 µM DMAPP at 50 µM [14C]IPP; zFPS-M2, 20-80 µM [14C]IPP at 150 µM DMAPP and 10-100 µM DMAPP at 50 µM [14C]IPP. All reactions were performed in a 50 µl volume containing 50 mM HEPES, 100 mM KCl, 7.5 mM MgCl2, 5% (v/v) glycerol, and 5 mM DTT (pH 8.0) and were incubated at 30°C for 10 to 15 min. The reaction was stopped by addition of 50 µl of 1 N HCl, and hydrolysis of the reaction products was allowed to proceed at 37°C for


INTRODUCTION
determine the final chain length of isoprenoids which typically ranges between C15 and C120, but can be greater than C10,000 in the case of natural rubber (Takahashi and Koyama, 2006).
TPTs and CPTs are distinct and operate through different catalytic mechanisms (Takahashi and Koyama, 2006; Liang, 2009). Several plant TPTs have been characterized, including those involved in the synthesis of GPP, 2E,6E-farnesyl diphosphate (E,E-FPP), geranylgeranyl diphosphate (C20) and solanesyl diphosphate (C45) that serve as the precursors of terpenes, sterols, hormones and carotenoids (Burke et al., 1999; Hirooka et al., 2003; Lange and Ghassemian, 2003; Ducluzeau et al., 2012). Similarly, plant CPTs involved in polyisoprenoid biosynthesis have been identified, including those involved in the synthesis of rubber (Cunillera et al., 2000;

helices II and III. These data provide insight into the biochemical evolution of trichome-derived specialized metabolites that function as insect deterrents.

Identification and characterization of NDPS1 from S. habrochaites
To identify sequences encoding NDPS1 enzymes from S. habrochaites and to begin to address the evolutionary relationship of NDPS1 and zFPS, cDNAs from the CPT1 locus were amplified and sequenced from multiple accessions of S. habrochaites that predominantly synthesize either monoterpenes or sesquiterpenes from the cisoid pathway. In total, 25 cDNAs were recovered from 17 accessions that share between 97 and 100% identity at the nucleotide level. Phylogenetic analysis revealed that these sequences group into two major clades (Fig. 2). One clade is defined by the presence of the previously characterized zFPS gene from S. habrochaites accession LA1777 (Sallaud et al., 2009) and the second, containing fewer sequences, is more closely related to NDPS1 from S. lycopersicum (Schilmiller et al., 2009).
The phylogeny of the S. habrochaites CPT1 sequences suggests that several may encode enzymes with NDPS1 activity. This hypothesis was tested by expressing a codon-optimized synthetic version of CPT1 from S. habrochaites accession LA2409, a monoterpene-producing accession (Gonzales-Vigil et al., 2012), in E. coli and assaying the activity of the purified recombinant enzyme. Upon incubation with DMAPP and IPP at equimolar concentrations, the recombinant enzyme synthesized the C10 isoprenoid NPP, which was detected as nerol following dephosphorylation, indicating that the enzyme possesses NDPS1 activity (Fig. 3A). Furthermore, as the ratio of IPP to DMAPP was increased, the product specificity of the enzyme remained unchanged. In contrast, purified recombinant enzyme derived from a codon-optimized version of CPT1 from S. habrochaites accession LA1393, a sesquiterpene-producing accession whose enzyme is identical to the previously characterized zFPS enzyme from S. habrochaites accession LA1777 (Sallaud et al., 2009; Gonzales-Vigil et al., 2012), preferentially synthesized the C15 product, Z,Z-FPP (Fig. 3A). The activities of CPT1 from LA2409 and LA1393 were also determined using NPP and IPP as substrates. As previously documented, zFPS is able to synthesize Z,Z-FPP from NPP and IPP (Sallaud et al., 2009), whereas CPT1 from LA2409 is unable to utilize this substrate combination (Fig. 3B). These data indicate that CPT1 from S. habrochaites accession LA2409 encodes an enzyme with NDPS1 activity.

Identification of amino acid residues important for the activity of NDPS1 and zFPS
The predicted protein sequences of NDPS1 encoded by CPT1 from accession LA2409 and zFPS encoded by CPT1 from accession LA1393 are highly similar, sharing 94% identity and 97% similarity over their entire length, which corresponds to 18 amino acid differences. A comparative sequence approach was used to further refine the identity of the amino acids responsible for specifying either NDPS1 or zFPS activity. Amino acid alignments of putative NDPS1 and zFPS proteins of S. habrochaites were examined for conserved residues that correlate with the ability of the parent accession to synthesize either monoterpenes or sesquiterpenes (Fig. 2) and that, in the case of putative S. habrochaites NDPS1 proteins, are also identical to the residue at the corresponding position in S. lycopersicum NDPS1. In total, 11 residues that reside within the predicted mature protein of NDPS1 and zFPS matched these criteria. These residues were generally distributed throughout the protein sequence, but several were located within conserved domains II, III, and V of CPTs (Table I; Supplemental Fig. S1). This analysis was followed by changing 10 residues that correlate with changes in substrate in NDPS1 from accession LA2409 to those found in zFPS from accession LA1393 (Pro/Ser45, which corresponds to the second residue in the predicted mature

coincided with a greatly reduced ability to synthesize NPP. Similarly, the zFPS-M10 recombinant protein lost the ability to synthesize Z,Z-FPP but gained NDPS1 activity and synthesized NPP at levels similar to NDPS1 from accession LA2409 (Fig. 4A).
Furthermore, substitution of the M10 residues also changed the substrate specificity of NDPS1 and zFPS with respect to their ability to utilize NPP. For example, the NDPS1-M10 variant gained the ability to synthesize Z,Z-FPP from NPP and IPP, whereas this activity was lost in the zFPS-M10 variant (Fig. 4B). Together, these data illustrate that substitution of the 10 conserved amino acids influenced the substrate and product specificities of trichome-expressed short-chain CPTs and was sufficient to interchange NDPS1 and zFPS activities.
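The residue screen described in this section can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `segregating_columns`, the toy sequences, and the chemotype labels are hypothetical placeholders chosen for illustration. The sketch keeps an alignment column only if the residue is fixed within each chemotype group, differs between the groups, and (optionally) matches a reference monoterpene-type sequence such as S. lycopersicum NDPS1.

```python
def segregating_columns(alignment, chemotype, reference=None):
    """Return (column, mono_residue, sesqui_residue) for alignment columns
    that segregate perfectly with chemotype. Columns are 1-based."""
    mono = [seq for name, seq in alignment.items() if chemotype[name] == "mono"]
    sesqui = [seq for name, seq in alignment.items() if chemotype[name] == "sesqui"]
    hits = []
    for i in range(len(mono[0])):
        mono_res = {seq[i] for seq in mono}
        sesqui_res = {seq[i] for seq in sesqui}
        # The residue must be invariant within each chemotype group.
        if len(mono_res) != 1 or len(sesqui_res) != 1:
            continue
        # The two groups must carry different residues.
        if mono_res == sesqui_res:
            continue
        m, s = next(iter(mono_res)), next(iter(sesqui_res))
        # Optionally require agreement with a reference NDPS1-type sequence.
        if reference is not None and reference[i] != m:
            continue
        hits.append((i + 1, m, s))
    return hits


# Hypothetical toy alignment; these are NOT real CPT1 sequences.
toy_alignment = {
    "mono_a": "MEAYKLIIG",
    "mono_b": "MEAYKLIIG",
    "sesq_a": "MDASKLLFG",
    "sesq_b": "MDASRLLFG",
}
toy_chemotype = {"mono_a": "mono", "mono_b": "mono",
                 "sesq_a": "sesqui", "sesq_b": "sesqui"}
toy_reference = "MEAYKLVIG"  # hypothetical reference; differs at column 7

print(segregating_columns(toy_alignment, toy_chemotype, reference=toy_reference))
```

In this toy case, column 5 is rejected because the sesquiterpene group is not invariant there, and column 7 is rejected only when the reference filter is applied.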

Homology modeling of NDPS1 and zFPS
To refine which subset of the M10 amino acids determines the substrate and product specificities of NDPS1 and zFPS, homology models were generated to visualize the relative positions of these amino acids within three-dimensional space (Fig. 5).
Currently, there are no available crystal structures of plant CPTs. However, BLAST searches of the Protein Data Bank revealed that NDPS1 and zFPS, without the chloroplast transit peptide, each share approximately 40% amino acid identity with an E. coli undecaprenyl pyrophosphate synthase (PDB: 1X07A), which catalyzes the formation of a C55 undecaprenyl pyrophosphate involved in peptidoglycan synthesis in the bacterial cell wall (Guo et al., 2005). Homology models of NDPS1 and zFPS were obtained using the SWISS-MODEL Workspace with QMEAN z-score values of -2.41 and -2.20 for NDPS1 and zFPS, respectively, which represent acceptable models.
In general, the models of NDPS1 and zFPS aligned to the template with reasonable overlap of the predicted helices (Fig. 5A and B). However, the third alpha-helix of both NDPS1 and zFPS is considerably shorter than that observed in the E. coli undecaprenyl pyrophosphate synthase and, instead of an extended helix structure, the sequence of both NDPS1 and zFPS forms disordered loops. The dimension of the hydrophobic cleft formed between helices II and III is known to influence the product chain length of CPTs (Kharel et al., 2006; Noike et al., 2008). Recently, six additional tomato CPTs (SlCPT2 through SlCPT7) that synthesize prenyl diphosphates ranging between C15 and C65 in length were identified and characterized (Akhtar et al., 2013). The enzymes that synthesize the shorter chain length products lack amino acid residues within a variable domain downstream of CPT conserved domain III. Homology models with acceptable QMEAN z-score values ranging between -1.52 and -3.44 were generated between five additional tomato CPT enzymes (SlCPT2, 4, 5, 6 and 7) and various CPT templates (Supplemental Fig. S2). A reliable model was not obtained for  Fig. S2). Together, these data suggest a correlation between the length of the third alpha-helix in the tomato CPTs, the size of the resulting hydrophobic cleft between the second and third helices, and the product chain length.
The structural models of NDPS1 and zFPS are similar to each other, with no predicted difference in the length of the third alpha helix (Fig. 5A and B), suggesting that the different catalytic activities of these enzymes are unlikely to be caused by major structural changes within this region. However, several of the divergent amino acids between NDPS1 and zFPS are located within CPT conserved domains II and III, which lie within or adjacent to either the second or third alpha helix (Fig. 5; Table I). In particular, the relative positions of the aromatic amino acids corresponding to Tyr100 in NDPS1 and Phe107 in zFPS, which are located within the second alpha helix within CPT conserved domain II, differ substantially between the two proteins (Fig. 5; Table I). Notably, the presence of bulky amino acids within the hydrophobic cleft between helices II and III is known to influence product chain length of CPTs (Kharel et al., 2006; Noike et al., 2008), suggesting that the relative positions of NDPS1 Tyr100 and zFPS Phe107 may impact the substrate and product specificities of these enzymes.

The relative positions of residues within region II determine the substrate and product specificities of NDPS1 and zFPS
As the domain II residues, and particularly the relative positions of Tyr100 and Phe107, constitute the major differences between the NDPS1 and zFPS structural models (Fig. 5), we converted the four conserved amino acids within domain II of NDPS1 (Glu98, Tyr100, Ile106 and Ile107) to those present in zFPS, generating NDPS1-M4 (Fig. 6A). We also converted these four domain II amino acids in zFPS to the corresponding residues from NDPS1, generating zFPS-M4. Incubation of zFPS-M4 with DMAPP and IPP led to the synthesis of NPP at a similar amount to that observed for NDPS1 from LA2409 (Fig. 6B). Furthermore, recombinant zFPS-M4 protein was unable to utilize NPP as a substrate (Fig. 6C).
Conversely, the NDPS1-M4 recombinant enzyme displays catalytic properties that are similar to zFPS and is able to utilize both DMAPP + IPP and NPP + IPP to synthesize Z,Z-FPP. Together, these data indicate that the residues within conserved domain II of NDPS1 and zFPS, which lie within the second alpha-helix, are important for determining substrate and product specificity of these short chain CPTs.
To test the hypothesis that the relative positions of Tyr100 in NDPS1 and Phe107 in zFPS are responsible for determining their activities, site-directed mutagenesis was performed to generate mutant versions of each enzyme in which the residues at these two positions were replaced with those present in the other enzyme, thereby switching the position of the aromatic amino acid (Fig. 6A). The resultant NDPS1 Tyr100Ser, Ile107Phe and zFPS Ser100Tyr, Phe107Ile constructs were designated NDPS1-M2 and zFPS-M2, respectively. The activity of each of the M2 constructs was determined using both DMAPP + IPP and NPP + IPP as substrates. Whereas the NDPS1-M4 construct gained the activity typically associated with zFPS, the NDPS1-M2 construct synthesized a mixture of C10 and C15 prenyl diphosphates when using DMAPP and IPP as substrates (Fig. 6B). In addition, while the NDPS1-M2 enzyme was able to utilize NPP as a substrate to synthesize Z,Z-FPP, this reaction was less efficient than that observed with the NDPS1-M4 and zFPS enzymes (Fig. 6C). In contrast, the activity of the zFPS-M2 enzyme was similar to that of zFPS-M4 and mirrored that of the wild-type NDPS1 enzyme (Fig. 6). However, the zFPS-M2 enzyme also retained a residual level of activity when supplied with NPP and IPP, although this was minor when compared to the activity of zFPS from LA1393 (Fig. 6C). In combination, these data indicate that while the relative positioning of the bulky aromatic amino acids between the predicted second and third helices of these short-chain CPTs is sufficient to convert the substrate and product specificity of zFPS to that of NDPS1, it is insufficient to confer the complete reciprocal change to NDPS1.
The dimensions and structure of the hydrophobic cleft of CPTs may be particularly important for specifying enzyme activity when multiple rounds of chain elongation occur, as is the case when Z,Z-FPP is synthesized from DMAPP and IPP. As such, the amino acids immediately flanking the aromatic residues within conserved domain II could be important for determining CPT activity by modifying the spatial orientation of the aromatic residues. To examine the role of the additional residues within the hydrophobic cleft in more detail, two additional mutants of NDPS1, NDPS1-M3V1 (Tyr100Ser, Ile106Leu, Ile107Phe) and NDPS1-M3V2 (Glu98Asp, Tyr100Ser, Ile107Phe), were constructed (Fig. 6A). In particular, we were interested in determining whether the Ile106Leu substitution, which lies immediately adjacent to Phe107 in the NDPS1-M4 and NDPS1-M2 constructs, influences the activity of these recombinant enzymes.
The activity of the resulting NDPS1-M3V1 recombinant enzyme more closely resembled that of zFPS: it produced predominantly Z,Z-FPP but still produced NPP as approximately 10% of the total product, whereas the NDPS1-M3V2 enzyme produced NPP and Z,Z-FPP in approximately equal proportions (Fig. 6B). Similarly, the NDPS1-M3V1 and NDPS1-M3V2 enzymes utilized NPP and IPP as substrates at similar relative proportions to those observed when using DMAPP and IPP as substrates (Fig. 6C).
Furthermore, as anticipated from the activities of the zFPS-M4 and M2 enzymes, both zFPS-M3V1 (Asp98Glu, Ser100Tyr, Phe107Ile) and zFPS-M3V2 (Ser100Tyr, Leu106Ile, Phe107Ile) enzymes synthesize NPP when supplied with DMAPP and IPP as substrates and essentially lost the ability to efficiently utilize NPP as a substrate (Fig. 6B and C). Overall, these data indicate that substitution of the aromatic amino acids at positions 100 and 107 is sufficient to convert zFPS into NDPS1 but the complete reciprocal change in activity from NDPS1 to zFPS requires additional amino acid substitutions at positions 98 and 106.

Downloaded from www.plantphysiol.org on August 29, 2017. Copyright © 2013 American Society of Plant Biologists. All rights reserved.
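The substitution notation used for the NDPS1 mutants in this section (for example, Tyr100Ser) can be applied programmatically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and `toy_ndps1` builds a placeholder sequence of alanines carrying only the four domain II residues named in the text, not the real NDPS1 sequence. Position numbers are assumed to index the sequence directly, and the NDPS1-M4 substitution list is inferred from the residues given for zFPS in the text.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_substitution(label):
    """Split a label such as 'Tyr100Ser' into ('Y', 100, 'S')."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label}")
    old, pos, new = match.groups()
    return THREE_TO_ONE[old], int(pos), THREE_TO_ONE[new]


def apply_substitutions(sequence, labels):
    """Apply substitutions after checking that the wild-type residue matches."""
    residues = list(sequence)
    for label in labels:
        old, pos, new = parse_substitution(label)
        if residues[pos - 1] != old:
            raise ValueError(f"{label}: found {residues[pos - 1]} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


def toy_ndps1(length=110):
    """Placeholder sequence: alanines plus the four domain II residues."""
    residues = ["A"] * length
    for pos, aa in {98: "E", 100: "Y", 106: "I", 107: "I"}.items():
        residues[pos - 1] = aa
    return "".join(residues)


# Substitution sets for the NDPS1 variants described in the text.
NDPS1_MUTANTS = {
    "NDPS1-M2": ["Tyr100Ser", "Ile107Phe"],
    "NDPS1-M3V1": ["Tyr100Ser", "Ile106Leu", "Ile107Phe"],
    "NDPS1-M3V2": ["Glu98Asp", "Tyr100Ser", "Ile107Phe"],
    "NDPS1-M4": ["Glu98Asp", "Tyr100Ser", "Ile106Leu", "Ile107Phe"],
}

for name, labels in NDPS1_MUTANTS.items():
    variant = apply_substitutions(toy_ndps1(), labels)
    print(name, variant[97:107])  # residues 98 through 107
```

Checking the wild-type residue before each replacement guards against numbering offsets, which matter here because position numbering may or may not include the chloroplast transit peptide.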

Kinetic analysis of mutant variants of NDPS1 and zFPS
The activities of mutant variants of NDPS1 and zFPS revealed qualitative changes in the synthesis of prenyl diphosphates that are consistent with a role of amino acid residues within CPT conserved region II as determinants of substrate and product specificity (Figs. 4 and 6). Furthermore, the majority of the mutant enzymes synthesize prenyl diphosphates at abundances similar to those of the corresponding parent enzyme. For example, NDPS1 and zFPS-M10 synthesize approximately the same amounts of NPP when supplied with DMAPP and IPP as substrates. However, it is possible that the kinetic properties of the mutant enzyme variants differ from the wild type enzymes and these differences may not be apparent if substrate concentrations are limiting. Therefore, kinetic analysis was performed on wild type and select mutant variants of both NDPS1 and zFPS.
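As an illustration of how Km and Vmax can be estimated from initial-rate data, the sketch below fits the Michaelis-Menten equation using the Hanes-Woolf linearization ([S]/v plotted against [S]). This is an assumed approach for illustration only; the source does not state which fitting method was used. The rate data are synthetic, generated from a Km of 112 µM (the value reported for DMAPP with NDPS1 from LA2409) together with a hypothetical Vmax and enzyme concentration.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate v at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) by least squares on [S]/v = [S]/Vmax + Km/Vmax."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope          # slope is 1/Vmax
    km = intercept * vmax       # intercept is Km/Vmax
    return vmax, km


def catalytic_efficiency(vmax, km, enzyme_conc):
    """Return kcat/Km, with kcat = Vmax / [E]."""
    return (vmax / enzyme_conc) / km


# Synthetic, noise-free data. Vmax and enzyme concentration are hypothetical.
TRUE_VMAX = 2.0      # µM per min (hypothetical)
TRUE_KM = 112.0      # µM (reported Km for DMAPP, NDPS1 from LA2409)
ENZYME_CONC = 0.05   # µM (hypothetical)

substrate = [10.0, 25.0, 50.0, 75.0, 100.0]   # µM
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in substrate]

vmax_fit, km_fit = hanes_woolf_fit(substrate, rates)
print(round(vmax_fit, 3), round(km_fit, 3))
print(catalytic_efficiency(vmax_fit, km_fit, ENZYME_CONC))
```

With real, noisy measurements a direct nonlinear least-squares fit is generally preferred over any linearization. Note also that when the tested substrate range lies below Km, the curve has barely begun to saturate and the fitted Vmax and Km are poorly constrained.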
Kinetic analysis of NDPS1 from LA2409 revealed Km values for DMAPP and IPP of 112 and 140 µM, respectively (Table II), which are similar to the values of 177 and 152 µM previously reported for S. lycopersicum NDPS1 (Schilmiller et al., 2009). The Km values of the M10 and M2 variants of zFPS for DMAPP are similar to that obtained for NDPS1, although the kcat and resulting catalytic efficiency of the zFPS-M2 enzyme are slightly lower than those observed for NDPS1. The Km for IPP of zFPS-M2 is lower than those measured for the NDPS1 and zFPS-M10 enzymes, although the catalytic efficiencies of each enzyme are similar (Table II). The Km of zFPS from LA1393 varied slightly when compared to previously published data (Sallaud et al., 2009). The

either mono- or sesquiterpenes (Table I; Supplemental Fig. S1). Reciprocal amino acid substitutions involving 10 of these residues confirmed that they were sufficient to switch the activity of NDPS1 to an enzyme that shares characteristics of zFPS and to convert zFPS to an enzyme that possesses NDPS1 activity (Fig. 4). Subsequent refinement using a combination of homology modeling, site-directed mutagenesis and enzyme activity assays, supplemented by kinetic analysis of recombinant enzymes, led to the discovery that the relative positioning of aromatic amino acids and immediately adjacent residues within the conserved CPT domain II are important determinants of substrate and product specificity of NDPS1 and zFPS (Figs. 5 and 6; Table II). Notably, the amino acid changes within the CPT domain II that specify either NDPS1 or zFPS activity require only one or two nucleotide changes (Fig. 6; Table I).
These data are congruent with previous studies of terpene biosynthesis that indicate that single nucleotide changes, which may occur relatively freely during evolution, can lead to qualitative changes in product profiles (Kollner et al., 2004;Kampranis et al., 2007;Xu et al., 2007;Keeling et al., 2008;Kollner et al., 2009;Green et al., 2011).
The chain length of isoprenoids varies dramatically from C10 in the case of monoterpenes to >C10,000 for natural rubber, and this wide range of diverse products is synthesized by prenyltransferases with different substrate and product specificities (Takahashi and Koyama, 2006). In the case of the trans-prenyltransferases, which are structurally and mechanistically distinct from the cis-prenyltransferases, bulky aromatic residues at the interface between two alpha helices are known to determine product chain length, and a combination of crystallography, structural modeling, bioinformatics, and biochemical analyses has led to the development and validation of predictive models of their functional properties (Ohnuma et al., 1996; Tarshis et al., 1996; Ohnuma et al., 1997; Ohnuma et al., 1998; Wallrapp et al., 2013).
Although less well defined, determinants of substrate and product specificity of cis-prenyltransferases have also been reported. These studies, in which mutagenesis of specific residues and the incorporation of additional residues at the flexible loop adjacent to domain III allowed products of longer chain length to be synthesized, highlighted the importance of the number of amino acid residues within the third alpha helix and residues within conserved domain III for the determination of product chain length (Kharel et al., 2006; Noike et al., 2008). For example, alanine substitution of bulky amino acids such as Leu84 in the Z,E-farnesyl diphosphate synthase of Mycobacterium tuberculosis allows an increase in product chain length over the wild-type enzyme (Noike et al., 2008). In the structural models of CPTs (Fig. 5), domain III residues form the third alpha helix and the preceding disordered loop (Fig. 5C), which, together with the second alpha helix, form the hydrophobic cleft close to the substrate binding sites where chain elongation occurs (Guo et al., 2005; Noike et al., 2008). The Solanum short-chain CPTs encoded by CPT1, CPT2, and CPT6 lack amino acid residues downstream of CPT domain III (Akhtar et al., 2013), which shortens the length of the third alpha helix in these proteins compared to those of CPT4, CPT5, and CPT7 that catalyze the formation of longer chain cis-prenyl diphosphates (Akhtar et al., 2013; Supplemental Fig. S2). Therefore, a correlation exists between the length of the third alpha helix and the product chain length. However, this correlation does not explain the difference in product chain length of NDPS1 and zFPS, and our data indicate a hitherto unknown role for the relative positioning of aromatic residues within conserved domain II, which is part of the second alpha helix (Fig. 5), in specifying the relative activity of these short-chain CPTs.
Further experiments are required to determine whether this region has broad utility across diverse CPTs.
In general, the function and evolutionary relationships of plant CPTs, together with the structural features that determine their substrate and product specificity, remain poorly understood. By utilizing an integrated approach involving comparative sequence analysis of chemically distinct germplasm, homology modeling, and site-directed mutagenesis, coupled with enzyme activity assays and kinetics, we have identified and characterized allelic diversity at the CPT1 locus within a natural population of S. habrochaites. This allelic diversity causes variation between NDPS1 and zFPS through altering the relative position of aromatic amino acids within a structurally conserved hydrophobic cleft that is important for product chain length determination of CPTs. This variation led to the formation of Z,Z-FPP in the chloroplasts of type VI glandular