A structural basis for the biosynthesis of the major chlorogenic acids found in coffee.

Chlorogenic acids (CGAs) are a group of phenolic secondary metabolites produced by certain plant species and an important component of coffee (Coffea spp.). The CGAs have been implicated in biotic and abiotic stress responses, while the related shikimate esters are key intermediates for lignin biosynthesis. Here, two hydroxycinnamoyl-coenzyme A shikimate/quinate hydroxycinnamoyl transferases (HCT/HQT) from coffee were biochemically characterized. We show, to our knowledge for the first time, that in vitro, HCT is capable of synthesizing the 3,5-O-dicaffeoylquinic acid diester, a major constituent of the immature coffee grain. In order to further understand the substrate specificity and catalytic mechanism of the HCT/HQT, we performed structural and mutagenesis studies of HCT. The three-dimensional structure of a native HCT and a proteolytically stable lysine mutant enabled the identification of important residues involved in substrate specificity and catalysis. Site-directed mutagenesis confirmed the role of residues leucine-400 and phenylalanine-402 in substrate specificity and of histidine-153 and the valine-31 to proline-37 loop in catalysis. In addition, the histidine-154-asparagine mutant was observed to produce 4-fold more dichlorogenic acids compared with the native protein. These data provide, to our knowledge, the first structural characterization of a HCT and, in conjunction with the biochemical and mutagenesis studies presented here, delineate the underlying molecular-level determinants for substrate specificity and catalysis. This work has potential applications in fine-tuning the levels of shikimate and quinate esters (CGAs including dichlorogenic acids) in different plant species in order to generate reduced or elevated levels of the desired target compounds.

Fruits and vegetables are major sources of antioxidants, and high levels of these compounds in the diet are believed to contribute to a reduction of cardiovascular disease and some cancers in people with high intakes of plant foods (Bazzano et al., 2002; Astley, 2003). One important group of antioxidants is the chlorogenic acids (CGAs), soluble esters formed between phenolic hydroxycinnamates and quinic acid. CGAs can be divided into various groups depending on the identity, number, and position of the acyl moiety (Clifford, 2000), with the most common groups being p-coumaroylquinic acids, caffeoylquinic acids (CQAs), feruloylquinic acids (FQAs), and dicaffeoylquinic acids (diCQAs). One or more CGAs are found in important food species such as potato (Solanum tuberosum), tomato (Solanum lycopersicum), apple (Malus domestica), and pear (Pyrus communis), and particularly high levels of 5-CQA and 3,5-diCQA are found in coffee (Coffea spp.) beans (Lepelley et al., 2007; Duarte et al., 2010). In addition to their notable dietary role, the hydroxycinnamate-quinate/shikimate esters are important secondary metabolites with multiple roles in plants. For example, elevated levels of CGAs in transgenic tomato plants have been shown to give increased protection from harmful UV light (Clé et al., 2008) and enhanced microbial resistance (Niggeweg et al., 2004). More recently, it has been shown that the CGAs can act as pest resistance factors in ornamental plants (Leiss et al., 2009). Meanwhile, the closely related shikimate esters are known to be key intermediates in the synthesis of lignin (Hoffmann et al., 2004; Chen and Dixon, 2007).
The CGAs and shikimate esters are synthesized via the phenylpropanoid pathway. Enzymes in the early part of this pathway have been known for several years (Li et al., 2010; Vogt, 2010), but the acyltransferases necessary for the last step in CGA biosynthesis have only more recently been identified and studied (Hoffmann et al., 2003; Niggeweg et al., 2004). These enzymes, the hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferases, reversibly shuttle hydroxycinnamoyl units between their CoA- and shikimate/quinate-esterified forms. The hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferases (HCTs), sometimes referred to as HSTs (Sander and Petersen, 2011), show a stronger preference for shikimate versus quinate, while the hydroxycinnamoyl-CoA quinate hydroxycinnamoyl transferases (HQTs) display the opposite substrate specificity. Both enzymes share over 64% sequence identity and can accept a range of acyl donors, including p-coumaroyl-CoA, caffeoyl-CoA, and feruloyl-CoA, although they exhibit some acyl substrate preferences. HCTs have been widely characterized in multiple plant species (Hoffmann et al., 2003; Lepelley et al., 2007; Wagner et al., 2007; Sander and Petersen, 2011), consistent with the presence of lignin in all vascular plants. In contrast, HQT homologs are more closely associated with CGA-accumulating plants such as coffee, artichoke (Cynara cardunculus ssp. scolymus), tomato, and tobacco (Nicotiana tabacum; Hoffmann et al., 2003; Niggeweg et al., 2004; Sonnante et al., 2010). It is currently thought that the primary route for 5-CQA formation in higher plants is via p-coumaroyl-CoA and quinic acid (Niggeweg et al., 2004), where the caffeic acid is generated by the combined activities of HCT/HQT and a P450 3′-hydroxylase (Abdulrazzak et al., 2006). The generation of diCQAs and mixed esters is less clear, as the enzyme responsible for their biosynthesis has not yet been identified.
Sequence comparisons indicate that both HCT and HQT belong to the fast-growing plant acyl-CoA-dependent BAHD superfamily, named according to the first letter of the first four biochemically characterized enzymes of this family (D'Auria, 2006). This family is involved in the biosynthesis of a large array of secondary metabolites, including a number of important drugs such as the sodium channel blocker ajmaline. Despite their generally low sequence identity, these enzymes possess a sequence-conserved catalytic HXXXD motif and can be further phylogenetically distributed into five (D'Auria, 2006) or eight (Tuominen et al., 2011) distinct clades. To date, three structures from the BAHD family have been solved: vinorine synthase (VS; Ma et al., 2005), an anthocyanin malonyltransferase (Dm3MaT3) in complex with malonyl-CoA (Unno et al., 2007), and two trichothecene acetyltransferases (TRI101 and TRI3) in complex with various CoA moieties and substrates (Garvey et al., 2008, 2009). VS, Dm3MaT3, and TRI101 catalyze the transfer of aliphatic moieties such as acetyl or malonyl groups and belong to clade I or II, whereas HCT and HQT are specific for aromatic acyl donors and belong to clade Vb. Based on phylogenetic analysis, the BAHD superfamily has rapidly evolved different substrate specificities through convergent evolution, thus making it difficult to predict substrate preferences based on bioinformatics alone (Luo et al., 2007). This is especially true for the BAHD hydroxycinnamoyl transferases, with, for example, HST and rosmarinic acid synthase (RAS) from Coleus blumei having a much broader substrate range than previously thought (Sander and Petersen, 2011).
Clearly, more structural data are needed to help elucidate the substrate specificity of this versatile superfamily. In an effort to understand the remarkable substrate diversity of the BAHD superfamily, we performed structural, biochemical, and mutagenesis studies of HCT and HQT from Robusta coffee (Coffea canephora). Here, we present the crystal structure of a native HCT and a Lys-210-Ala/Lys-217-Ala mutant HCT. Based on these structural data, we performed mutagenesis studies of residues likely involved in substrate binding and catalysis. In addition, we show that HCT and the His-154-Asn mutant are able to produce diCQAs in vitro. This novel function is consistent with substrate-docking results and offers new molecular insights into this important family of acyltransferases.

Protein Production and Enzymatic Activity
HCT and HQT from Robusta coffee (Lepelley et al., 2007) were overexpressed in Escherichia coli and highly purified for both biochemical and structural studies. Our enzymatic analysis was primarily performed at pH 6.5, as 5-CQA isomerizes significantly to its 3-CQA and 4-CQA forms at higher pH values (Friedman and Jürgens, 2000). We additionally noted that significantly faster isomerization of 5-FQA and 3,5-diCQA occurred above pH 7. To precisely follow the changes in the phenolic substrates and products, we chose an HPLC method using either standard retention times (Supplemental Fig. S1) or HPLC-coupled mass spectrometry. The forward reaction, with caffeoyl-CoA and either quinate or shikimate as substrate, shows that coffee HCT has a clear preference for shikimate, while HQT prefers quinate (Supplemental Figs. S2 and S3), as expected (Hoffmann et al., 2003; Niggeweg et al., 2004; Sonnante et al., 2010). We also observed that 5-CQA and 5-caffeoylshikimic acid (5-CSA) are the major enzymatic products and that the subsequent 3- or 4-isomerization of 5-CQA occurs nonenzymatically in solution. Interestingly, it was noted that after a long incubation time (60 min versus 5 min), when both shikimate and quinate were present at equimolar levels, the bias toward the production of 5-CQA by HQT was reduced.
The reverse reaction catalyzed by HCT and HQT, using 5-CQA and CoA as substrates, shows that both enzymes catalyze the formation of caffeoyl-CoA from 5-CQA but that HCT produced significantly less (Supplemental Fig. S4). When HCT and HQT were assayed with CoA and 5-FQA, activity was only detected with HQT (Supplemental Fig. S5). We then analyzed the effect of different pH values on the reverse reaction of HCT and HQT. These experiments indicated that they have distinct optimal pH values and that neither has detectable activity below pH 4.6. HQT was found to have an optimum close to pH 6.0, similar to previous observations (Niggeweg et al., 2004), whereas the HCT appears to exhibit a pH optimum closer to pH 7.5 to 8.0. Higher pH values were not studied because of interference by 5-CQA isomerization under basic conditions. The apparent pH optimum difference between HCT and HQT could at least partially explain the observation that, at equimolar concentrations, HQT is more active than HCT in converting 5-CQA to caffeoyl-CoA at pH 6.0 to 6.5. While it is currently unclear what level of CGA isomerization occurs in vivo, we tested whether HQT could react with the purified 3-CQA and/or 4-CQA isomers, as this enzyme exhibited higher activity in the reverse reaction. Interestingly, only the 5-CQA isomer efficiently serves as a substrate for HQT, making 5-CQA the most likely in vivo substrate.
Although coffee and other plants produce significant amounts of diCQAs and other more complex esters of quinic acid, the enzymes responsible for their synthesis have not been identified. Thus, when we unexpectedly found small new peaks potentially corresponding to diCQAs after an overnight HCT reaction with 5-CQA and CoA (Fig. 1), we decided to investigate this activity further. More detailed work indicated that these peaks could also be detected with shorter reaction times and that they had retention times and absorption profiles expected for diCQA (Supplemental Fig. S6). Additional experiments suggest that equimolar ratios of 5-CQA and CoA result in low but detectable amounts of diCQA production coupled to very low caffeoyl-CoA production, but a high CoA:5-CQA ratio (10:1) showed high caffeoyl-CoA production and no detectable diCQA production. Other results confirm that 3,5-diCQA is the predominant product and that isomerization to 3,4- and 4,5-diCQA follows. This is also consistent with the observation that diCQA, presumably the 3,5-diCQA isomer, can also serve as a substrate (Supplemental Fig. S7).

Structure Determination and Overall Structure
To understand the catalytic activity and substrate specificity of HCT and HQT, we determined the crystal structure of native HCT. The structure was solved by molecular replacement using VS (Ma et al., 2005) as a search model (Table I). These studies indicated that the HCT "crossover" loop is susceptible to proteolytic cleavage (Lallemand et al., 2012), which could hamper crystallization. To maintain intact protein for crystallization, we generated a protease-resistant HCT by mutating two Lys residues (Lys-210 and Lys-217) in this region to Ala. The resulting Lys-mutant HCT (herein referred to as Lys-HCT) is less sensitive to proteolysis, showed comparable catalytic activity to native HCT (Table II; Supplemental Fig. S8), and readily crystallized. Two crystal forms were observed: form 1 (P2₁2₁2₁) was preponderant, and form 2 (C222₁) was obtained only occasionally. The Lys-HCT crystal structures were solved by molecular replacement using native HCT as a model (Table I). The tertiary structure of HCT is composed of two nearly equal-sized chloramphenicol acetyltransferase-like domains (Fig. 2) comprising a large mixed β-sheet flanked by α-helices. This two-domain topology is present in all BAHD superfamily structures to date (Ma et al., 2005). The domains are connected by the large crossover loop between Pro-206 and Thr-224. The two native and three Lys-HCT molecules observed are essentially identical and can be superimposed with a root-mean-square deviation of between 0.42 and 0.72 Å. The only major structural difference observed between the individual native and Lys-HCT molecules is the conformation of the Val-31 to Pro-37 loop located in the solvent channel region and the subsequent side chain conformation of the active site His-153 (Fig. 3). This results in three distinct conformations for this region, which is adjacent to the active site.
The first, observed in one of the native HCT molecules, orients His-153 toward the substrate-binding pocket and is a putative active conformation (Fig. 3A). The second conformation is observed in the other independent molecule of native HCT and in one of the Lys-HCT crystal forms. Here, His-153 forms a π-stacking interaction with His-35 and is again oriented toward the solvent channel in a conformation compatible for catalysis (Fig. 3B). The side chain conformation of the active site His-153 in these structures is most similar to the other BAHD structures solved to date (Fig. 3D) and, therefore, is the most likely active conformation. The third, found in the other Lys-HCT crystal form, has the His-153 side chain pointing away from the solvent channel and forming a π-stacking interaction with the nearby His-154 (conserved in all HCTs). This also results in the Val-31 to Thr-36 loop adopting a new conformation that blocks access to the active site (Fig. 3C). Based on this loop positioning, it is highly unlikely that this is an active conformation.
Table II. Steady-state kinetic parameters of native and mutant HCT. Assays were performed as described in "Materials and Methods," and average values ± SE (n = 3) are also shown.

Docking Results
Despite many attempts, we were never able to obtain a structure of HCT in complex with CoA, substrates, or products. To probe the active site and identify likely residues involved in substrate binding, we used AutoDock Vina (Trott and Olson, 2010) as a modeling tool in conjunction with the HCT crystal structure. The CoA moiety was first positioned based on a superimposition of the two active HCT conformations and TRI101-CoA-T2 crystal structures (Garvey et al., 2008). The binding site for CoA is highly conserved and consists of residues predominantly associated with the C-terminal domain. The adenosine moiety is positioned at the protein-solvent interface, and the pantetheine arm extends through the solvent channel toward the active site. While there is a relatively high sequence and structure conservation for residues involved in CoA binding in the BAHD family (Supplemental Figs. S9 and S10), the putative CoA-binding site in HCT clearly lacks the typical π-stacking interaction between the adenosine moiety and an aromatic residue. Tyr-252 appears to be the most likely residue to form such an interaction in HCT based on primary sequence and structural alignments, but it would require a conformational rearrangement of the loop containing Tyr-252 (Supplemental Fig. S11). Indeed, in TRI101, the sequence-related Phe-258 undergoes a conformational change when binding CoA (Garvey et al., 2008). The phosphate groups of CoA can form ion- or hydrogen-bonding interactions through the highly conserved residues Lys-240, Ser-254, Arg-289, and Arg-374.
Docking simulations comprising putative substrates 5-CQA, 5-FQA, 5-CSA, 3,5-diCQA, and quinic, shikimic, caffeic, and ferulic acids were performed in the two possible active conformations with and without CoA present in the docking model. A large number of solutions were analyzed, and the most biologically relevant conformations were selected based on the proximity of the C-5 hydroxyl group of the quinate or shikimate moiety to the catalytic His-153. In general, the best docking solutions based on our biochemical data were observed using the native molecule 2 or Lys-HCT crystal form 2. But similar solutions were also obtained using the alternative active site conformation observed in the native molecule 1 (Supplemental Fig. S12). Within the vicinity of His-153, three substrate-binding pockets (SBP1-SBP3) were identified (Fig. 4).
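The proximity-based selection of docking solutions described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the 4.0 Å cutoff, the choice of the His-153 NE2 atom as the reference point, the `Pose` record, and all coordinates and scores are hypothetical placeholders introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose:
    name: str
    affinity_kcal_mol: float  # docking score; more negative means better
    hydroxyl_o_xyz: tuple     # reactive hydroxyl oxygen position, in angstroms


def select_poses(poses, his_n_xyz, cutoff=4.0):
    """Keep poses whose reactive hydroxyl lies within `cutoff` angstroms of
    the catalytic His nitrogen, then rank them by score (best first)."""
    near = [p for p in poses if math.dist(p.hydroxyl_o_xyz, his_n_xyz) <= cutoff]
    return sorted(near, key=lambda p: p.affinity_kcal_mol)


# Hypothetical coordinates and scores, not taken from the paper.
HIS153_NE2 = (10.0, 10.0, 10.0)
poses = [
    Pose("pose_1", -7.9, (18.0, 10.0, 10.0)),  # best score, but 8.0 A away
    Pose("pose_2", -7.1, (12.0, 11.0, 12.0)),  # 3.0 A away
    Pose("pose_3", -6.4, (10.0, 13.5, 10.0)),  # 3.5 A away
]

selected = select_poses(poses, HIS153_NE2)
print([p.name for p in selected])
```

In this toy data set the best-scoring pose is discarded because its hydroxyl sits too far from the histidine, which mirrors the idea of ranking solutions by biological plausibility rather than by score alone.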
The quinate and shikimic moieties preferentially dock in a pocket we named SBP1 (Fig. 4A). This pocket is predominantly hydrophobic in nature, but there are polar residues that could form hydrogen bonds with the substrate, including the catalytic His-153, Thr-36, Arg-357, and Thr-370 and the more distant Thr-305 and Tyr-397. However, while the Arg-357 side chain is not ordered in all the HCT crystal structures, it can be modeled to form a potential charge-charge interaction with the carboxylate group of either quinic or shikimic acid. We observed that the hydroxycinnamoyl moiety consistently docks in a distinct binding pocket, SBP2 (Fig. 4B). Here, the phenolic ring is sandwiched between Gly-158 and Trp-372, with which it can form a π-stacking interaction, while Ser-38 and Tyr-40 are positioned for hydrogen bonding with the C-4 hydroxyl of the aromatic ring. The main-chain carbonyl oxygen of Thr-36 is oriented toward the C-5 hydroxyl and would provide an additional hydrogen bond to further stabilize substrate binding. Lastly, the best docking solutions for 3,5-diCQA were found with the hydroxycinnamoyl moiety on the C-3 position of quinic acid in SBP2. The hydroxycinnamoyl moiety at C-5 is accommodated by a third binding pocket (SBP3) near the surface of the protein (Fig. 4F). This conformation results in no steric clashes and brings the C-3 hydroxyl group close to His-153, which is consistent with our biochemical data, where we observed the initial production of 3,5-diCQA and its subsequent nonenzymatic isomerization (Supplemental Figs. S8 and S13).

HCT Mutagenesis and Enzymatic Analysis
To examine the role of specific amino acids on the enzymatic activity of HCT, we selectively changed several amino acids putatively involved in catalysis based on our structural and docking analysis. These mutations were generated using Lys-HCT due to its higher stability and because it has a comparable activity to the native enzyme (Table II; Supplemental Figs. S8, S13, and S14). The activities of the mutants were compared with the native and Lys-HCT in a forward reaction with caffeoyl-CoA and quinic acid and in a reverse reaction with 5-CQA and CoA using HPLC (Supplemental Fig. S8). To test the hypothesis that His-35 and His-153 are critical catalytic residues, each was mutated to an Ala. The His-35-Ala HCT mutant had a significantly lower production of 5-CQA in the forward direction and also appeared to generate a small amount of free caffeic acid (i.e. hydrolysis of caffeoyl-CoA). In the reverse direction, the His-35-Ala mutant had no detectable reaction with 5-CQA. As expected, the His-153-Ala mutation had no enzymatic activity for the forward or reverse reaction, which is consistent with its role as a catalytic residue.
From sequence analysis, HCTs generally have two adjacent His residues, His-153 and His-154 (Fig. 4G), the former demonstrated to be an important catalytic residue here. Amino acids at position 154 in HQTs are highly variable, and in coffee HQT this residue is an Asn (Fig. 4G). Therefore, this mutation was made and assayed in the forward and reverse reactions. In the forward reaction, His-154-Asn produces 5-CQA as effectively as the native protein but also generates somewhat more diCQA and much less free caffeic acid (Supplemental Fig. S8). The more striking result was seen for the reverse direction, where the His-154-Asn mutant produces significantly more diCQA than the native enzyme (Fig. 1; Supplemental Fig. S8). To verify this kinetically, we compared the enzymatic activity of this mutant with native and Lys-HCT (Table II). We observed that the Lys-HCT and His-154-Asn mutants have similar kinetic values in the forward reaction and that most of the relatively small kinetic differences appear to be accounted for by the Lys-HCT mutation. Most interesting is the observation that the His-154-Asn mutant is nearly 20-fold faster in the reverse direction (Table II). This is consistent with our HPLC analysis data and confirms that the native and Lys-HCT are biochemically similar, while the His-154-Asn mutant, which produces much more diCQA, is significantly more active in the reverse reaction.
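As an illustration of how triplicate rate measurements can be summarized as a mean ± SE (n = 3) and compared as a fold change between two enzyme variants, a minimal Python sketch follows. The replicate values are invented placeholders in arbitrary units and do not reproduce Table II; the function names are likewise hypothetical.

```python
import math
import statistics


def mean_and_se(replicates):
    """Return (mean, standard error of the mean) for replicate measurements.
    SE is the sample standard deviation divided by sqrt(n)."""
    n = len(replicates)
    mean = statistics.mean(replicates)
    se = statistics.stdev(replicates) / math.sqrt(n)
    return mean, se


def fold_change(mutant_replicates, reference_replicates):
    """Ratio of the mutant mean rate to the reference mean rate."""
    return statistics.mean(mutant_replicates) / statistics.mean(reference_replicates)


# Invented placeholder rates (arbitrary units), n = 3 each.
reference_rates = [1.0, 1.5, 2.0]
mutant_rates = [3.0, 4.5, 6.0]

ref_mean, ref_se = mean_and_se(reference_rates)
mut_mean, mut_se = mean_and_se(mutant_rates)
print(f"reference: {ref_mean:.2f} +/- {ref_se:.2f}")
print(f"mutant:    {mut_mean:.2f} +/- {mut_se:.2f}")
print(f"fold change: {fold_change(mutant_rates, reference_rates):.1f}")
```

A fold change reported without its uncertainty can overstate small differences, so the standard errors are best read alongside the ratio.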
To verify that our substrate-docking results were valid, we characterized additional mutations predicted to be involved in substrate recognition. Two positions in HCT were targeted as likely candidates to be important in determining substrate specificity for shikimate versus quinate. The Leu-400-Thr, Phe-402-Tyr, and Leu-400-Thr/Phe-402-Tyr mutations selected were designed to mimic the HQT acyl acceptor-binding site (Fig. 4, A and G). As expected based on our structural and docking studies, we observed a clear shift in substrate preference from shikimic to quinic acid in the forward reaction (Table II). However, while the double mutation shows a somewhat weaker trend in the reverse reaction, there is still a clear preference for 5-CQA.

DISCUSSION
Our biochemical studies confirm the classification of coffee HCT and HQT as having substrate preferences for shikimic and quinic acid, respectively. To facilitate crystallization studies, a protease-resistant mutant was generated by mutating two Lys residues (Lys-210 and Lys-217) in the crossover loop to Ala. This mutant is proteolytically stable, has a similar biochemical activity to the native protein (Table II; Supplemental Fig. S8), and proved crucial in obtaining reproducible crystals that diffracted to a high resolution. The overall structure of HCT is similar to other BAHD proteins (Fig. 2), with His-153 from the conserved HX₃D motif confirmed as the catalytic residue by mutagenesis (Supplemental Fig. S8). Here, we present further structural and mutational analyses that allow the identification of a number of flexible loops and residues important for substrate recognition and catalysis.

Plasticity of the Active Site
The crystal structures for the native and Lys-HCTs differ mainly in the Val-31 to Pro-37 loop and the active site His-153 side chain conformation (Fig. 3). In many protein structures, similar His-His contact pairs to that in HCT have been observed, and they are notably involved in stabilizing secondary structures (Heyda et al., 2010). As His (pKa approximately 6) is generally only protonated in acidic conditions, the possibility of Coulombic repulsion between the paired residues under physiological conditions is small (Heyda et al., 2010). In both the His-35/His-153 and His-153/His-154 pairs observed here, the π-stacking constrains rotation of the imidazole ring. While the His-153/His-154 pair blocks substrate binding, the His-35/His-153 pair induces an active site most similar to those in other BAHD structures (Fig. 3), and the significantly reduced biochemical activity of the His-35-Ala mutant confirms its important role in properly orienting His-153 for catalysis (Supplemental Fig. S8).
The conditions used for the crystallization of the native and mutant HCTs are very different, especially the pH values, which are 6.5 and 8.5 to 9.0, respectively. This may account for the conformational differences observed in the active site, as the protonation state of His is dependent on both the pH and local environment. The lower pH used to crystallize the native HCT may induce a Coulombic repulsion (Heyda et al., 2010) and result in the alternative active conformations observed in the two independent molecules in the native HCT crystal. In molecule 1, an alternative active conformation may be generated due to potentially unfavorable interactions between neighboring His residues, whereas the binding of a sulfate ion nearby helps stabilize the active conformation observed in molecule 2. It is likely that the conformation of His-153, and thereby the activity of HCT, is modulated by both the pH and flexibility of nearby loops, particularly the Val-31 to Pro-37 loop. This also helps explain the observation that HCTs have a higher pH optimum than HQTs, which is probably due to sequence differences in these regions, especially the substitution of His-154 for Asn, Ser, or Thr (Fig. 4G). This suggests that modifications to these regions may alter the catalytic properties, such as the pH optimum, of these proteins.
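The pH argument above can be made concrete with the Henderson-Hasselbalch relation, which gives the protonated fraction of a titratable group as 1 / (1 + 10^(pH - pKa)). The sketch below assumes an unperturbed His pKa of 6.0, the approximate value quoted in the text; the actual pKa of His-35, His-153, or His-154 inside the protein is not known here and may be shifted by the local environment.

```python
def fraction_protonated(ph, pka=6.0):
    """Henderson-Hasselbalch: fraction of a titratable group that is
    protonated (charged, for His) at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Crystallization pH values reported in the text: 6.5 (native HCT)
# and 8.5 to 9.0 (Lys-HCT). The pKa of 6.0 is an assumed model value.
for ph in (6.5, 8.5, 9.0):
    print(f"pH {ph}: {fraction_protonated(ph):.4f} protonated")
```

Under this assumption roughly a quarter of such His side chains would carry a charge at pH 6.5 (about 0.24), compared with well under 1% at pH 8.5 to 9.0, which is in line with the suggestion that Coulombic repulsion is more plausible in the native crystal than in the Lys-HCT crystals.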

Acyl Acceptor-Binding Site
The SBP1 acyl acceptor-binding site was identified by comparing the docking results of both quinic and shikimic acids with those of 5-CQA, 5-CSA, and 5-FQA. The large size of the substrate-binding channel combined with side chain flexibility, especially Arg-357, resulted in a number of alternative docking possibilities. However, while the specific interactions may differ, the global site was sufficient to identify residues that are likely to be important for binding. Similar results were obtained on docking deoxynivalenol (DON) into the apo-TRI101 structure. The binding site identified overlaps well with the acyl acceptor-binding sites of the TRI101 acetyltransferases (Garvey et al., 2008; Fig. 4, A and D). Residues important for binding include those in the flexible α1-β3 loop of HCT (Fig. 4G) that not only participate in binding the hydroxycinnamoyl moiety but also the acyl acceptor. Others include Thr-305, Thr-370, as well as Leu-400 and Phe-402. TRI101-CoA-DON is the only TRI101 structure where the loop containing Asp-221/Ala-222 at the entrance of the solvent channel is ordered. The Ala-221 amide forms a hydrogen bond with the 5-hydroxyl of DON. In HCT, the loop containing the potentially important Arg-357 occupies a similar position (Fig. 4, A and D), supporting our hypothesis that this residue may participate in acceptor substrate binding.
The main difference between quinic and shikimic acids is the presence of a hydroxyl group at C-1 in quinic acid and a double bond between C-1 and C-2 in shikimic acid, resulting in a different geometry of the polyol ring. In the different HCTs, the strictly conserved Leu-400 and Phe-402 (Fig. 4G) are in the vicinity of the C-1 position of the acceptor molecule (Fig. 4A). The more hydrophobic nature of these residues may favor HCT binding of shikimic acid via van der Waals interactions in the region of the double bond. In all HQTs, the Leu is replaced by a Thr and the Phe by a Tyr (Fig. 4G). These changes facilitate hydrogen-bonding interactions with a hydroxyl at C-1 and favor the binding of quinic acid. In the TRI101 ternary complex, the corresponding residue is a Leu (Fig. 4, D and G), which directly interacts with the T2 mycotoxin/DON backbone ring (Garvey et al., 2008). In addition, this region varies in other clade V family members, including red clover (Trifolium pratense) HCT2 (Sullivan, 2009) and RAS (Berger et al., 2006), which bind very different acceptor molecules. To confirm that this region is important for the acyl acceptor substrate specificity, we characterized the activity of several HCT mutants (Table II). In the forward reaction, the Leu-400-Thr and Phe-402-Tyr mutations clearly move the substrate preference away from shikimic acid and toward quinic acid, with the double Leu-400-Thr/Phe-402-Tyr mutation demonstrating an additive effect. Therefore, this region is clearly involved in alternative acyl acceptor group binding, and further mutations in this region could be exploited for tuning the active site in order to accept novel natural or synthetic substrates.
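One common way to express a shift in acceptor preference is as a ratio of catalytic efficiencies (kcat/Km) for the two substrates. The Python sketch below assumes that convention, which is not necessarily the measure reported in Table II; the kinetic constants are invented placeholders in arbitrary units, and the function names are hypothetical.

```python
def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/Km for one enzyme-substrate pair."""
    return kcat / km


def quinate_preference(kcat_quinate, km_quinate, kcat_shikimate, km_shikimate):
    """Ratio of catalytic efficiencies, quinate over shikimate.
    Values above 1 indicate a preference for quinate."""
    return (catalytic_efficiency(kcat_quinate, km_quinate)
            / catalytic_efficiency(kcat_shikimate, km_shikimate))


# Invented placeholder constants (arbitrary units), not taken from Table II.
native_pref = quinate_preference(2.0, 2.0, 10.0, 0.5)   # shikimate favored
mutant_pref = quinate_preference(6.0, 1.0, 4.0, 2.0)    # quinate favored

print(f"native quinate/shikimate preference: {native_pref:.2f}")
print(f"mutant quinate/shikimate preference: {mutant_pref:.2f}")
```

Comparing the ratio before and after a mutation separates a genuine change in specificity from a general gain or loss of activity toward both acceptors.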

Hydroxycinnamoyl Binding Pocket
The SBP2 hydroxycinnamoyl binding pocket identified from the structural and docking studies is sandwiched between Trp-372 and Gly-158 in HCT. In Lys-HCT crystal form 1, a glycerol molecule is bound in this site, and in MaT-malonyl-CoA, the aliphatic malonyl moiety is pointing toward this pocket (Unno et al., 2007), further supporting our assignment of SBP2 as a bona fide substrate-binding pocket. Indeed, Trp-372 is conserved in many BAHD members, although it is occasionally replaced by other residues, such as a Thr in MaT (Fig. 4G). Based on this, it seems likely that residues lining the opposite side of the hydroxycinnamoyl pocket facilitate the binding of larger moieties. For example, Gly-158 at the entrance of this pocket is conserved in all HCT, HQT, and many other BAHD benzoyl/hydroxycinnamoyl binding members but not in the smaller acyl/malonyl transferases (Fig. 4G). In particular, this residue is a Met in TRI101 and a Val in VS, which would block access to this pocket for the larger phenolic substrates (Fig. 4, B and E). An Ala in this position, as observed in MaT and in some benzoyl-binding members (Supplemental Fig. S10), probably allows these proteins to accommodate a larger activated CoA thioester compared with VS or TRI101. Interestingly, in MaT, Arg-178 forms a hydrogen bond with the malonyl group, blocking the substrate-binding pocket from accommodating a hydroxycinnamoyl moiety. A smaller Gly, Ala, or Ser is found at this position in HCT/HQT, while a much larger side chain, such as a Leu in VS or a Gln in TRI101, is found in other BAHD structures (Fig. 4G). Thus, residues lining different parts of the substrate-binding pocket are able to finely tune the available volume to selectively accommodate different substrates.
Our biochemical analysis shows that HCT transfers feruloyl moieties much less efficiently than caffeoyl or coumaroyl moieties, similar to what was reported for tobacco HCT (Hoffmann et al., 2003). In contrast, HQT is able to utilize all three substituted hydroxycinnamoyl moieties equally well. An analysis of the amino acid changes between HCT and HQT in this pocket suggests that Met-151, which is close to the C-5 hydroxyl of the docked caffeoyl moieties, may be important for this biochemical difference. Met-151 is conserved in all HCTs, except red clover HCT2, where it is a Phe. The large side chain in this position is replaced by a smaller Val in all HQTs (Fig. 4G). Such a substitution may accommodate a methoxyl group on the C-5 and increase the binding of feruloyl moieties, potentially explaining why the coffee HQT efficiently converts ferulate substrates. In addition, the Val-31 to Pro-37 loop, which also forms part of the hydroxycinnamoyl binding site, can adopt different conformations in HCT. In HQT, this loop could adopt an alternative conformation and increase the size of this binding pocket to accommodate methoxylated cinnamic acids. These regions, therefore, may help determine the substrate specificity between hydroxyl and methoxyl substitution on C-5 of the aromatic ring. As the feruloyl moiety is important for the synthesis of the guaiacyl and syringyl lignin forms, it may be interesting in the future to explore the biological impact of altering Met-151 and the Val-31 to Pro-37 loop region of HCT in model plants such as Arabidopsis (Arabidopsis thaliana). But because the methylation reaction that produces feruloyl-CoA occurs downstream of the HCT/HQT enzymes, alterations to favor or disfavor binding of the 5-OMe substrate to these enzymes may produce little effect in many plants. However, the high levels of 5-FQA accumulating in coffee (0.5%-1.0% of dry matter in mature seeds; Lepelley et al., 2007) could be the result of HQT driving the synthesis of 5-FQA when feruloyl-CoA is present. This ability of HQT to move feruloyl-CoA to 5-FQA may explain why there is no evidence in the literature of significant amounts of lignin in coffee seeds, despite the fact that high levels of phenylpropanoid pathway metabolites appear to be passing through HCT/HQT.

HCT Can Produce diCQA
To our knowledge, the in vitro enzymatic synthesis of CQA diesters by HCT has not been reported previously. We initially observed weak diCQA peaks in reactions catalyzed by HCT after overnight incubation with 5-CQA and CoA. These peaks were subsequently confirmed by mass spectrometry analysis (Supplemental Fig. S13). These unexpected results, obtained when using equimolar concentrations of both quinate esters and CoA in combination with relatively long reaction times, suggest that the ester formation seen with short in vitro reaction times may not fully reflect the situation in vivo, where reaction time scales are much longer. To better understand diCQA production by HCT, we studied the native and mutant HCTs in more detail. Remarkably, the His-154-Asn and His-154-Asn/Ala-155-Leu/Ala-156-Ser HCT mutants produced 4-fold more diCQAs than the native protein (Fig. 1; Supplemental Fig. S8). A major peak corresponding to 3,5-diCQA was always observed first (Supplemental Fig. S13), suggesting that it is synthesized enzymatically and subsequently isomerizes into 3,4- and 4,5-diCQA. Because HCT was observed to produce 3,5-diCQA, this ligand was also used in docking studies. The whole substrate-binding pocket (encompassing SBP1-SBP3) is sufficiently large to accommodate this diester. The quinic moiety was located in SBP1, with the acyl moiety on C-3 located in SBP2 and the acyl moiety on C-5 located in SBP3. The increased formation of diesters favored by the His-154 mutants may be influenced by changes in the Val-31 to Pro-37 loop conformation (Fig. 3). These mutants may stabilize an alternative conformation, allowing more space for the acyl moiety to bind in SBP3; however, confirming this would require additional structural studies.

CONCLUSION
From the biochemical, structural, and ligand-docking results presented here, we identified key residues involved in hydroxycinnamoyl-CoA binding and acyl acceptor substrate recognition (Table II; Fig. 4; Supplemental Fig. S8) that are consistent with previous BAHD structures and sequences. We further show that HCT, as well as a double Lys mutant, Lys-HCT, can produce dichlorogenic acids (diCGAs) in vitro, which makes it plausible that HCT is directly involved in the generation of the high levels of diesters found in coffee beans. This may be due to the high concentration of 5-CGA found in coffee beans and the active site plasticity of HCT. This activity may be specific for coffee, as no diCGA production was observed with HCT from globe artichoke (Menin et al., 2010), although the in vitro assay conditions were different. Interestingly, we also observed that a His-154-Asn mutant can significantly increase the level of diCGAs produced, which could be exploited to alter the chlorogenic acid content of other plant species for therapeutic uses.
Members of the BAHD family are key enzymes in the biosynthetic pathway of many pharmaceutically important compounds such as morphine, taxol, and ajmaline (Ma et al., 2005). Therefore, these results in combination with the broad substrate acceptance observed for HST and RAS (Sander and Petersen, 2011) offer an important starting point for the structure-based design of novel pharmaceuticals. Our studies highlight several key regions of HCT that could be mutated for the generation of such compounds. In addition, HCT is involved in the synthesis of lignin, a key obstacle to the development of second-generation biofuels. Recent studies have shown that the dwarfism observed in HCT-RNA interference Arabidopsis plants is most likely derived from the conversion of shikimate to salicylic acid (Gallego-Giraldo et al., 2011). Our results could also be used to modulate the substrate specificity of HCT and form a basis for new research directions in the area of biofuel production from nonfood plant biomass.

Cloning, Expression, and Mutagenesis
Codon-optimized synthetic genes encoding the protein sequences for Robusta coffee (Coffea canephora) HCT (GenBank accession no. EF137954; Lepelley et al., 2007) and HQT (GenBank accession no. EF153931; Lepelley et al., 2007) as well as a double Lys-mutated HCT (Lys-210-Ala and Lys-217-Ala) were obtained from Geneart. The native and Lys-HCT genes were cloned into pProEX_HTb (Invitrogen) as described previously (Lallemand et al., 2012). The synthetic HQT gene was cloned into the pET-28M-SUMO3 vector using the BamHI and XhoI restriction sites. This plasmid was transformed into Escherichia coli BL21(DE3)-pLysS and grown using a similar protocol to that of HCT. Other mutants of HCT (all of which included the double Lys mutations) were generated using the QuikChange kit (Stratagene) and essentially produced as for native HCT.

Purification, Crystallization, and Data Collection
The N-terminal His-tagged native and Lys-HCT were purified using a standard protocol including His tag removal (Lallemand et al., 2012). A similar protocol was used to purify the recombinant N-terminal His-SUMO3-tagged HQT, except that a SenP2 protease was used to cleave the tag and an ion-exchange step was included before the gel filtration step. The purified proteins were concentrated to between 10 and 20 mg mL⁻¹ in 20 mM HEPES or Tris, pH 8.0, and 100 mM NaCl for crystallization trials and storage at −80°C. We obtained native HCT crystals once after 3 months, and a data set was collected for one of these crystals to 3.0 Å resolution (Lallemand et al., 2012). The Lys-HCT mutant crystallized readily at 20°C from solutions composed of 15% polyethylene glycol 4000, 0.2 M MgCl₂, and 0.1 M Tris-HCl, pH 9.0, after 1 to 3 d. The crystals were flash frozen at 100 K after transferring them to identical crystallization conditions containing 20% glycerol. X-ray data were collected on beamline ID14-4 (McCarthy et al., 2009) at the European Synchrotron Radiation Facility. All the x-ray data were integrated and scaled using the XDS suite (Kabsch, 2010), and a summary of the data statistics is given in Table I.

Structure Determination, Refinement, and Docking Calculations
The native HCT structure was solved by the molecular replacement method using the VS structure (Ma et al., 2005) as a search model with Phaser (McCoy et al., 2007). These phases were then improved using the prime-and-switch option in Resolve (Terwilliger, 2004). Several rounds of manual building and structure refinement resulted in final R and Rfree values of 18.9% and 25.0%, respectively. The Lys-HCT crystal form 1 and 2 structures were solved using the native HCT model in Phaser (McCoy et al., 2007). Several rounds of manual building and refinement resulted in final R and Rfree values of 16.8% and 20.3% for crystal form 1 and 19.4% and 25.4% for crystal form 2. Refinement was performed with Refmac (Murshudov et al., 1997), structure building used COOT (Emsley and Cowtan, 2004), and MOLPROBITY (Davis et al., 2007) was used for model validation. Refinement and model-building statistics are summarized in Table I. The atomic coordinates and structure factors have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University (http://www.rscb.org) with the accession codes 4G0B (native HCT), 4G22 (Lys mutant crystal form 1), and 4G2M (Lys mutant crystal form 2).
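For readers less familiar with the R and Rfree statistics quoted above, the following minimal Python sketch shows the conventional crystallographic residual, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|. It is illustrative only and is not the Refmac implementation: the function name and the toy amplitudes are hypothetical, and the amplitudes are assumed to be already on a common scale. Rfree is the same quantity evaluated on a set of reflections withheld from refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor.

    f_obs and f_calc are equal-length sequences of observed and calculated
    structure-factor amplitudes, assumed here to be on the same scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    # Sum of absolute amplitude differences over all reflections
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    # Sum of observed amplitudes
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical toy amplitudes, not data from this study
    working_obs, working_calc = [10.0, 20.0, 30.0], [9.0, 22.0, 29.0]
    print(f"R = {100 * r_factor(working_obs, working_calc):.1f}%")
```

Applying the same function to the withheld reflections instead of the working set would give the corresponding Rfree value.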

Enzyme Assays and HPLC/Mass Spectroscopy Analyses
For the forward reaction, a standard reaction mixture (100 µL) consisted of 0.1 or 0.2 mM caffeoyl-CoA and either 1 or 5 mM quinic acid or 5 mM shikimic acid in 100 mM sodium phosphate buffer, pH 6.0 or 6.5. The standard reaction mixture for the reverse reaction was composed of 1 or 5 mM CoA and 1 or 3.75 mM 5-CQA, 5-FQA, or 3,5-diCQA in 100 mM sodium phosphate buffer, pH 6.0 or 6.5. All the reactions were started by adding 0.1, 0.5, or 1 µM enzyme, and a standard reaction without any enzyme was always run as a control. After incubation between 0 min and overnight, the reaction was terminated by adding 100 µL of 0.1% formic acid. The reaction mixtures were analyzed using an HPLC-based method with either a dual wavelength (256 and 325 nm) or a photodiode array detector (210-400 nm). HPLC separations were performed either on a Novapak column (4 µm, 4.6 × 250 mm; Waters) or a Nucleosil column (250 × 4 mm, 5 µm; Macherey-Nagel) at 30°C or 25°C. All samples were filtered before injection (0.2-µm membrane; Pall Corp.). HPLC separations used a gradient of 8% to 50% acetonitrile or 20% to 60% methanol containing 0.1% H₃PO₄ or 0.1% formic acid at a flow rate of 0.8 mL min⁻¹. The substrates and products were characterized by their retention times in relation to the available standards (Supplemental Fig. S1). Later, tandem mass spectroscopy analysis was carried out to confirm these peak assignments using an API-2000 system from Applied Biosystems in negative mode (Supplemental Figs. S13 and S14). Maximum absorbances for the chlorogenic acids were around 242 and 325 nm, that for CoA was at 256 nm, and those for hydroxycoumaroyl-CoA ester were around 257 and 333 nm. The hydroxycinnamoyl-CoA substrates were initially produced using a purified tobacco (Nicotiana tabacum) 4CL enzyme and purified as described elsewhere (L.A. Allemand and A.A. McCarthy, unpublished data) and later purchased from TransMIT. CGA, CoA, quinate, and shikimate were purchased from Sigma-Aldrich, and 3,5-diCGA (and the 3,4- and 4,5-isomers) were purchased from Biopurify.

Determination of Kinetic Parameters
To evaluate the substrate specificities of the native and mutant HCTs, we used a standard reaction mixture of 40 mM sodium phosphate, pH 6.5, totaling 500 µL at 27°C. In the forward reaction, we used 10 mM p-coumaroyl-CoA as the acyl donor and 0 to 1.5 mM shikimate or 0 to 5 mM quinate as the acyl acceptor. The reaction was started by adding approximately 15 µg of protein, and the enzyme activity was measured as the decrease of the A340 for p-coumaroyl-CoA consumption. In the reverse reaction, the same enzymes were tested in identical conditions using 6 mM CoA and 0 to 3 mM CGA. The reaction was started by adding approximately 15 µg of protein, and the enzyme activity was measured as an increase in the A360 for caffeoyl-CoA formation. All measurements were carried out using a Beckman DU-800 UV/Vis spectrophotometer. All initial velocities were linear and did not show any curvature indicative of saturation over the assay time course. Steady-state kinetic parameters were determined by fitting the initial velocity data to the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]), in Kaleidagraph (Synergy Software). Assays using quinate as a substrate exhibited substrate inhibition and were fit to v = Vmax[S]/(Km + [S](1 + [S]/Ki)). Previously published extinction coefficients (Ulbrich and Zenk, 1979; Niggeweg et al., 2004) were used, and the p-coumaroyl-CoA was produced as described previously (Wang et al., 2011).
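As an illustration of the two rate laws used for these fits, the short Python sketch below evaluates the Michaelis-Menten equation and the standard substrate-inhibition form, v = Vmax[S]/(Km + [S](1 + [S]/Ki)). It is not the Kaleidagraph procedure used in this study: the function names and the parameter values in the example are hypothetical and chosen only to show the shape of each curve.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity for v = Vmax*[S] / (Km + [S])."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """Initial velocity for v = Vmax*[S] / (Km + [S]*(1 + [S]/Ki)).

    As Ki becomes very large, this reduces to the Michaelis-Menten form.
    """
    return vmax * s / (km + s * (1.0 + s / ki))


if __name__ == "__main__":
    # Hypothetical parameters (arbitrary velocity units, mM concentrations);
    # these are NOT the fitted values reported in this study.
    vmax, km, ki = 1.0, 0.5, 2.0
    for s in (0.1, 0.5, 1.0, 2.5, 5.0):
        v_mm = michaelis_menten(s, vmax, km)
        v_si = substrate_inhibition(s, vmax, km, ki)
        print(f"[S] = {s:4.1f} mM  MM: {v_mm:.3f}  inhibited: {v_si:.3f}")
```

With substrate inhibition, the velocity passes through a maximum and then declines at high substrate concentration, whereas the Michaelis-Menten curve rises monotonically toward Vmax; this difference is why the quinate data call for the second equation.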

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. HPLC elution profile for available CGA standards.
Supplemental Figure S2. HPLC elution profiles for HCT incubated with caffeoyl-CoA and quinic or shikimic acid.
Supplemental Figure S3. HPLC elution profiles for HQT incubated with caffeoyl-CoA and quinic or shikimic acid.
Supplemental Figure S4. HPLC elution profiles for HCT and HQT incubated with 5-CQA and CoA.
Supplemental Figure S5. HPLC elution profiles for HCT and HQT incubated with 5-FQA and CoA.
Supplemental Figure S6. HPLC elution profile for diCQA formation by HCT.
Supplemental Figure S7. HPLC elution profile showing diCQA is turned over by HCT.
Supplemental Figure S8. HPLC elution profiles for native and selected mutant HCTs.
Supplemental Figure S9. Sequence and structure alignment of HCT and HQT with various BAHD superfamily members.
Supplemental Figure S11. The CoA binding site of HCT and TRI101.
Supplemental Figure S12. Comparison of 5-CQA docking in the proposed and alternate active site of HCT.
Supplemental Figure S13. HPLC/MS analysis of diCQA production by Lys-HCT.