Amino Acid Residues Critical for the Specificity for Betaine Aldehyde of the Plant ALDH10 Isoenzyme Involved in the Synthesis of Glycine Betaine

Plant ALDH10 enzymes catalyze the oxidation of ω-primary or ω-quaternary aminoaldehydes but, intriguingly, only some of them, such as the Spinacia oleracea betaine aldehyde dehydrogenase (SoBADH), efficiently oxidize betaine aldehyde (BAL) to form the osmoprotectant glycine betaine (GB), which confers tolerance to osmotic stress. The crystal structure of SoBADH reported here shows Tyr-160, Trp-167, Trp-285, and Trp-456 in an arrangement suitable for cation-π interactions with the trimethylammonium group of BAL. Mutation of these residues to alanine resulted in significant Km(BAL) increases and Vmax/Km(BAL) decreases, particularly in the Y160A mutant. Tyr-160 and Trp-456, strictly conserved in plant ALDH10s, form a pocket where the bulky trimethylammonium group binds. This space is reduced in ALDH10s with low BADH activity because an isoleucine pushes the tryptophan against the tyrosine. Those with high BADH activity instead have alanine (Ala-441 in SoBADH) or cysteine, which allow enough room for binding of BAL. Accordingly, the mutation A441I decreased Vmax/Km(BAL) of SoBADH ∼200-times while the mutation A441C had no effect. The kinetics with other ω-aminoaldehydes were not affected in the A441I or A441C mutants, demonstrating that the existence of an isoleucine in the second sphere of interaction of the aldehyde is critical for discriminating against BAL in some plant ALDH10s. A survey of the known sequences indicates that plants have two ALDH10 isoenzymes; those known to be GB-accumulators have a high-BAL-affinity isoenzyme with alanine or cysteine in this critical position, while non-GB-accumulators have low-BAL-affinity isoenzymes containing isoleucine. Therefore, BADH activity appears to restrict GB synthesis in non-GB-accumulator plants.


INTRODUCTION
Osmotic stress caused by drought, salinity or low temperatures is a major limitation of agricultural production. Some plants synthesize and accumulate glycine betaine (GB), the most efficient osmoprotectant known (Courtenay et al., 2000), when subjected to osmotic stress (Yancey et al., 1982; Hanson and Wyse, 1982; Weretilnyk et al., 1989; Valenzuela-Soto and Muñoz-Clares, 1994). It is generally accepted that GB is synthesized in the chloroplast stroma, as it is in spinach (Hanson et al., 1985), by a two-step oxidation of choline: first the alcohol group of choline is oxidized to the aldehyde group of betaine aldehyde (BAL) in a reaction catalyzed by choline monooxygenase (E.C. 1.14.15.7, CMO), an enzyme unique to plants (Burnet et al., 1995); then the aldehyde group of BAL is oxidized to the acid group of GB in a reaction catalyzed by plant betaine aldehyde dehydrogenase (betaine aldehyde:NAD(P)+ oxidoreductase, E.C. 1.2.1.8, BADH) (Hanson et al., 1985; Weretilnyk and Hanson, 1989; Arakawa et al., 1987; Valenzuela-Soto and Muñoz-Clares, 1994; Burnet et al., 1995; Hibino et al., 2001; Nakamura et al., 2001; Fujiwara et al., 2008; Kopečný et al., 2011), an enzyme that belongs to the aldehyde dehydrogenase (ALDH) family 10 (ALDH10) (Vasiliou et al., 1999). Engineering the synthesis of GB in crops that naturally lack this ability has been a biotechnological goal for improving tolerance to osmotic stress (McNeil et al., 1999; Rontein et al., 2002; Waditee et al., 2007). The several attempts made so far have had limited success, underscoring the need for a better understanding of the structural and functional properties of the enzymes involved in the GB biosynthetic pathway.
Most biochemically characterized plant ALDH10s appear to be ω-aminoaldehyde dehydrogenases (AMADHs) that can oxidize small aldehydes possessing an ω-primary amine group, such as 3-aminopropionaldehyde (APAL) and 4-aminobutyraldehyde (ABAL) (Vojtěchová et al., 1997; Trossat et al., 1997; Šebela et al., 2000; Livingstone et al., 2003; Oishi and Ebina, 2005; Fujiwara et al., 2008;
the wild-type and the mutant enzymes, in experiments in which the concentration of NAD+ was varied at a fixed concentration of BAL (a high but non-inhibitory concentration that was at least 4-times the Km(BAL) value of each enzyme) (Supplemental Table S2).
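The choice of a BAL concentration of at least 4-times Km(BAL) can be illustrated with a short calculation. The sketch below is illustrative only and assumes simple Michaelis-Menten saturation by BAL (no substrate inhibition, and no treatment of the two-substrate mechanism); the function name is hypothetical and not part of the original analysis.

```python
def fractional_saturation(s_over_km: float) -> float:
    """Return v/Vmax for a substrate present at `s_over_km` multiples of its Km.

    Assumes simple Michaelis-Menten kinetics: v/Vmax = [S] / (Km + [S]).
    """
    if s_over_km < 0:
        raise ValueError("substrate concentration cannot be negative")
    return s_over_km / (1.0 + s_over_km)


if __name__ == "__main__":
    # Compare a few multiples of Km, including the 4x threshold used in the text.
    for ratio in (1, 4, 10):
        print(f"[S] = {ratio} x Km -> v/Vmax = {fractional_saturation(ratio):.2f}")
```

Under these assumptions, a fixed substrate at 4-times its Km keeps the enzyme at 80% or more of saturation, so the apparent parameters for NAD+ are close to, but may slightly underestimate, the values at full saturation.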
The wild-type SoBADH exhibited kinetic parameters for BAL, APAL and ABAL (Table I and Fig. 2D) similar to those reported earlier for this enzyme (Incharoensakdi et al., 2000). The kinetics of TMABAL were comparable to those of APAL and ABAL. The Vmax and Km values determined using BAL as substrate were between 4- and 9-times and between 12- and 26-times higher, respectively, than the values obtained using the other ω-aminoaldehydes, whose tighter binding to the enzyme appears to correlate with slower catalysis. This results in a higher (between 1.2- and 4.3-times) catalytic efficiency (measured as Vmax/Km) for the other ω-aminoaldehydes tested than for BAL (Table I and Fig. 2D). The mutant enzymes exhibited significantly increased Km(BAL) and decreased Vmax/Km(BAL) values, particularly the Y160A mutant, which had a 140-times higher Km(BAL) and a 550-times lower Vmax/Km(BAL) than the wild-type SoBADH (Table I and Fig. 2D), indicating that the tyrosine aromatic ring is of the utmost importance for binding of the aldehyde. On the basis of the observed changes in Km(BAL), Trp-285 also appears to be very important for productive BAL binding, followed by Trp-167 and last by Trp-456. Although in our energy-minimized model Trp-285 is the farthest from the methyl groups of BAL, the significant effect of its mutation may be in part due to the loss of the van der Waals interactions of its indole ring carbon atoms CH2 and CZ3 with the phenol oxygen of Tyr-160. These interactions contribute to maintaining the position of Tyr-160 that allows binding of the trimethylammonium group (see below).
The mutant enzymes exhibited decreased Vmax values, but to a lesser extent than the increases in Km values. The exception was the W167A mutant, in which the mutation had a greater effect on Vmax than on Km. Interestingly, the mutation of the four aromatic residues had a much smaller effect on the kinetics of SoBADH with APAL, ABAL and TMABAL than on those with BAL (Table I and Fig. 2D). The mutant enzymes W167A, W285A and W456A exhibited small changes in Km for APAL, ABAL and TMABAL compared with the wild-type enzyme, indicating that their affinity for these aldehydes was not substantially affected and, therefore, that these aromatic residues do not contribute to their binding. The Vmax values determined using these aldehydes as substrates were not substantially affected by the mutation of these three aromatic residues. Consequently, neither were the Vmax/Km values. The Y160A mutant, however, exhibited an 80-times increase in the Km value for APAL, and around 20-times increases in the Km values for ABAL and TMABAL with respect to the wild-type SoBADH, resulting in substantially decreased Vmax/Km values. Multiple alignments of the known plant ALDH10 amino acid sequences indicated that Tyr-160, Trp-167 and Trp-456 are strictly conserved residues in these enzymes, whereas Trp-285 is a phenylalanine or an alanine in some of them. On the basis of our results, it could be speculated that those enzymes with an alanine in the position equivalent to Trp-285 in SoBADH would use APAL, ABAL and TMABAL as substrates preferentially to BAL.

enzymes
To date, only three plant ALDH10 enzymes have a known three-dimensional structure: the SoBADH reported here (PDB code 4A0M) and two isoenzymes from Pisum sativum (PsAMADH1 and PsAMADH2, PDB codes 3IWK and 3IWJ, respectively) (Tylichová et al., 2010). As neither of the pea enzymes can use BAL as substrate (Šebela et al., 2000; Tylichová et al., 2010) whereas the spinach enzyme uses this aldehyde very efficiently, we compared the active sites of SoBADH and PsAMADH2 to find the structural reasons for this important difference between them. We chose PsAMADH2 for this comparison because it has a tryptophan residue in the position of Trp-285 of SoBADH whereas PsAMADH1 has a phenylalanine. The superposition of the aldehyde-binding sites of the two enzymes shows that every residue lining the aldehyde entrance tunnel has a very similar conformation in both of them, with the exception of the side-chain of the tryptophan residue equivalent to Trp-456, which in the pea enzyme is closer to the phenol group of the side-chain of the tyrosine residue equivalent to Tyr-160 (Tyr-163 in PsAMADH2) than in the spinach enzyme. This results in a narrower cavity in PsAMADH2 than in SoBADH at the place where the bulky trimethylammonium group of BAL should be accommodated (Figs. 3A and B).
Energy-minimized models of the productively bound BAL molecule indicated that the trimethylammonium group can be bound in SoBADH but that it clashes with the tryptophan residue in PsAMADH2 (Fig. 3C). The reason for this is the different position of the tryptophan side-chain in PsAMADH2, which is pushed towards the phenol group of the tyrosine by the side-chain of a non-active site residue, an isoleucine (Ile-444, PsAMADH2 numbering) that lies behind and in close contact with the indole ring, at van der Waals distance (3.6 Å). Instead of Ile-444, SoBADH has Ala-441, whose much smaller side-chain allows Trp-456 to be at a distance from Tyr-160 sufficient for accommodating the trimethylammonium group of BAL (Fig. 3C). To investigate whether this difference in a residue in the second sphere of interaction of the aldehyde could account for the differences in BAL specificity between the spinach and pea enzymes, we mutated Ala-441 in SoBADH to an isoleucine. As expected, the mutant A441I enzyme showed a greatly reduced affinity for BAL, indicated by a Km(BAL) value 23-times higher than that of the wild-type enzyme. It also exhibited a ∼7-times lower Vmax, which results in a decrease in the catalytic efficiency of the A441I mutant with BAL as substrate, Vmax/Km(BAL), of around 160-times. In contrast, the Km and Vmax values for the other ω-aminoaldehydes were very similar to those of the wild-type enzyme (Table II and Figs. 3D and E). These results confirm our hypothesis of the critical importance of the side-chain of the residue behind the indole ring of the tryptophan residue equivalent to Trp-456 for plant ALDH10s to discriminate against BAL.
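The relation between the reported fold-changes can be checked with simple arithmetic: because catalytic efficiency is the ratio Vmax/Km, a fold-increase in Km and a fold-decrease in Vmax multiply. The following sketch is only an illustration of that arithmetic using the rounded fold-changes quoted in the text; the function names are hypothetical and were not part of the original analysis.

```python
def efficiency_fold_decrease(km_fold_increase: float, vmax_fold_decrease: float) -> float:
    """Fold-decrease in Vmax/Km implied by a fold-increase in Km
    combined with a fold-decrease in Vmax."""
    return km_fold_increase * vmax_fold_decrease


def implied_vmax_fold_decrease(efficiency_fold: float, km_fold_increase: float) -> float:
    """Fold-decrease in Vmax implied by a fold-decrease in Vmax/Km
    and a fold-increase in Km (the inverse of the relation above)."""
    return efficiency_fold / km_fold_increase


if __name__ == "__main__":
    # A441I with BAL: Km about 23-times higher, Vmax about 7-times lower (from the text).
    a441i = efficiency_fold_decrease(km_fold_increase=23, vmax_fold_decrease=7)
    print(f"A441I: Vmax/Km(BAL) decreases about {a441i:.0f}-fold")

    # Y160A with BAL: Km 140-times higher, Vmax/Km 550-times lower (from the text).
    y160a = implied_vmax_fold_decrease(efficiency_fold=550, km_fold_increase=140)
    print(f"Y160A: implied Vmax decrease of about {y160a:.1f}-fold")
```

With the rounded inputs, the A441I product is 161, in line with the decrease of around 160-times stated above; the Y160A values imply a Vmax decrease of roughly 4-fold, much smaller than the accompanying change in Km.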
The only other plant ALDH10 that so far has been found to have a very poor affinity for BAL is the barley BBD1 (Fujiwara et al., 2008), an enzyme that also has an isoleucine in this position. Interestingly, barley has another ALDH10 isoenzyme with a high affinity for BAL, called BBD2, which possesses a cysteine residue in the position of Ala-441 of SoBADH (Cys-439, BBD2 numbering). We anticipated that in this enzyme the steric impediment to the binding of the trimethylammonium group of BAL does not occur because of the small size of the cysteine side-chain. To test this, we constructed the A441C mutant and confirmed that this change did not affect Vmax and had a slight negative effect on Km(BAL), which was increased 1.7-times compared with the wild-type value (Table II and Figs. 3D and E). These findings give additional support to our proposal of the critical importance of a small residue behind the indole group of the tryptophan residue for allowing BAL binding. The saturation kinetics of APAL, ABAL and TMABAL were not affected in the A441C mutant, as was also expected.

DISCUSSION
The positive charge of the quaternary nitrogen of the trimethylammonium group of BAL suggests that negatively charged active-site residues should be involved in conferring substrate specificity to plant BADHs, by analogy with other enzymes that bind this group (Quaye et al., 2008). In a first study with SoBADH, Glu-103, which is strictly conserved in the known plant ALDH10 enzymes, was thought to be this residue; when it was changed to glutamine, however, there was no change in the kinetics with BAL as substrate and only a small negative effect on those with APAL and ABAL (Incharoensakdi et al., 2000). The crystal structure of the spinach enzyme reported here explains these results: the side-chain carboxylic group of Glu-103 is far from the aldehyde tunnel (Supplemental Fig. S3). ALDH10 enzymes also have two conserved aspartates (Asp-107 and Asp-110, SoBADH numbering), whose carboxyl groups are exposed, or partially exposed in the case of Asp-107, to the solvent filling the aldehyde tunnel. The energy-minimized model of the productively bound BAL indicates that these carboxyl groups are too far away from the trimethylammonium group to directly interact with it. While this paper was in preparation, Kopečný et al. (2011) reported marked decreases in the affinity for APAL and ABAL of PsAMADH2 mutants in which these two aspartates were changed to alanines. Our energy-minimized models with these aminoaldehydes productively bound (not shown) showed that Asp-107 and Asp-110 are more than 7.5 Å away from the amino group of ABAL and APAL.
Since the carboxyl of Asp-110 is relatively close to that of Asp-107, and therefore may influence its position, and the carboxyl group of Asp-107 is at an appropriate distance from Tyr-160 to interact electrostatically with the aromatic ring, the observed negative effects of the mutations D107A and D110A in PsAMADH2 may be, at least in part, due to the loss of this latter interaction. This interaction may be relevant for the correct positioning of the tyrosyl residue equivalent to Tyr-160, a residue that is critical for the binding of the aldehydes (see below).
On the other hand, the trimethylammonium group of choline or GB has been shown to bind to proteins mainly through cation-π interactions with aromatic residues (Holtmann et al., 2004;Horn et al., 2006). The crystal structure of SoBADH showed four aromatic residues, Tyr-160, Trp-167, Trp-285, and Trp-456 in an arrangement suitable for cation-π interactions with the trimethylammonium group of BAL (Fig. 2).
The residue equivalent to Trp-167 was suspected to participate in binding BAL in the BADH from cod liver (Johansson et al., 1998), which is an ALDH9, not an ALDH10, enzyme (Vasiliou et al., 1999), but this possibility was discarded on the basis that this residue is conserved in several other ALDHs that are not specific for BAL. The kinetics of the SoBADH mutant enzymes in which the four aromatic residues were separately changed were consistent with the involvement of these residues in binding of the trimethylammonium group. Our results indicate that Tyr-160 is an important residue for binding the shortest ω-aminoaldehydes BAL and APAL, particularly for BAL, but also for binding of ABAL and TMABAL in spite of their amino group being farther from the aromatic ring. Interestingly, the Km for propionaldehyde, which lacks the amino group, is increased 50-times in the SoBADH
(Fig. 1). The biochemical characterization of the ALDH10 isoenzymes from amaranth, spinach, barley, rice (Oryza sativa), pea, maize, and Arabidopsis thaliana supports our proposal that those having alanine or cysteine exhibit a high activity with BAL while those having isoleucine have a poor activity with this substrate (Supplemental Table S3). The exception is the isoleucine-containing isoenzyme from Avicennia, which was reported to have a high affinity for BAL and to be unable to use APAL and ABAL as substrates (Hibino et al., 2001).
Interestingly, all known ALDH10s that we propose as high-BAL-affinity isoenzymes contain a tryptophan in the position equivalent to Trp-285 of SoBADH, while several low-BAL-affinity isoenzymes have an alanine, phenylalanine, proline or serine instead. This is consistent with the kinetics we observed for the W285A mutant.
Different ALDH10 genes have been reported to exist in the genome of several plants (McCue and Hanson, 1992; Ishitani et al., 1995; Wood et al., 1996; Legaria et al., 1998; Hibino et al., 2001; Bradbury et al., 2005)
plants. This is supported by our finding that the plant species in which one of the two isoenzymes is a high-BAL-affinity enzyme, according to our criterion of possessing alanine or cysteine in the position of Ala-441 of SoBADH, as well as other plants in which only one isoenzyme has been found so far, with alanine in this position, such as spinach and sugar beet (Beta vulgaris), or with cysteine, such as wheat (Triticum aestivum), have been reported to be GB-accumulators, whereas plants that have only low-BAL-affinity isoenzymes are reported to lack the ability to accumulate GB (Supplemental Table S3). Moreover, a functional CMO has only been found in species of Amaranthaceae that have the high-BAL-affinity isoenzyme: amaranth (Russell et al., 1998; Meng et al., 2001), orache (Atriplex hortensis) (Shen et al., 2002), spinach (Rathinasabapathi et al., 1997), and sugar beet (Russell et al., 1998), whereas a non-functional gene was found in rice (Luo et al., 2007) and the recombinant CMO protein from Arabidopsis has no activity (Hibino et al., 2002).
There are other CMO sequences deposited in GenBank, but it is not yet known whether the CMO proteins in these plants are functional or not.
Although there are no experimental data concerning the subcellular location of most of the ALDH10 enzymes, the presence of alanine or cysteine in the position of Ala-441 of SoBADH correlates with a chloroplastic location in some of them while the presence of an isoleucine correlates with a peroxisomal location in others, but this is not a general rule (Supplemental Table S3). Thus, in Amaranthaceae the high-BAL-affinity isoenzymes lack the carboxy-terminal tripeptide SKL that has been considered a signal for transport into peroxisomes (Gould et al., 1988), and the chloroplastic location of the spinach enzyme has been experimentally determined (Weigel et al.,
peroxisomal location has been proved for the low-BAL-affinity isoenzyme from barley (Nakamura et al., 1997) and for the second Arabidopsis isoenzyme (Missihoun et al., 2011), both of which have the C-terminal SKL tripeptide.

CONCLUSION
The first crystal structure of a plant BADH, that from spinach, together with site-directed mutagenesis studies, provides for the first time experimental evidence of the aromatic residues involved in binding of the trimethylammonium group of BAL in plant ALDH10 enzymes and, importantly, of the main structural feature determining whether they accept betaine aldehyde as substrate: a non-active site amino acid residue located in the second sphere of interaction of the aldehyde bound inside the active site. If this is a small residue, alanine or cysteine, the enzyme will be a true BADH, whereas if this residue is an isoleucine, it pushes an active-site residue so that the cavity where the bulky trimethylammonium group of BAL binds is narrowed and the binding of BAL is prevented. Consequently, the activity with BAL will be low and the enzyme can be described as an AMADH. This conclusion is supported by the previously reported biochemical characterization of some of these enzymes. A survey of the known plant ALDH10 sequences indicates that the presence or absence of the high-BAL-affinity ALDH10 isoenzyme in plants correlates with their being GB-accumulators or non-GB-accumulators, respectively. Therefore, the lack of the high-BAL-affinity ALDH10 isoenzyme appears to be a major limitation for GB biosynthesis in plants.
from 10 to 250 mM in Buffer B. Excess imidazole was removed by centrifugal concentration, using Amicon Ultra 30 (Millipore), while Buffer B was replaced by Buffer A. To remove small contaminants, a final purification step through a Mono Q HR5 column (GE Healthcare) connected to an HPLC system (Waters) was included for the crystallization experiments.

Site-directed mutagenesis
The plasmid pET28-SoBADH, containing the full sequence of the spinach badh gene and an N-terminal His-tag, was used as template for site-directed mutagenesis, which was performed via polymerase chain reaction (PCR) using the Quick Change XL-II Site Directed Mutagenesis system (Agilent) and the following

Activity assay and kinetic characterization of the wild-type and mutant SoBADH enzymes
The specific dehydrogenase activities of wild-type SoBADH and its mutants were measured spectrophotometrically at 30 °C by monitoring the increase in the absorbance at 340 nm (ε = 6,220 M⁻¹ cm⁻¹) in a mixture (0.5 mL) consisting of 50 mM HEPES-KOH buffer, pH 8.0, 1 mM EDTA, and either 0.2 mM NAD+ and variable concentrations of the aldehydes, or saturating concentrations of the aldehydes and variable NAD+. The exact concentration of the aldehydes was determined by endpoint assays using SoBADH and the standard assay conditions described below, and the exact concentration of NAD+ by its absorbance at 260 nm using a molar absorptivity of 18,000 M⁻¹ cm⁻¹ (Dawson et al., 1986). All assays were initiated by addition of the enzyme. Each saturation curve was determined at least in duplicate using enzymes from different purification batches. One unit of activity is defined as the amount of enzyme that catalyzes the formation of 1 μmol of NADH per min under our assay conditions. Kinetic data were analyzed by non-linear regression calculations using a
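The conversion from absorbance readings to the quantities used above follows the Beer-Lambert law. The sketch below is illustrative only: the molar absorptivities and the 0.5 mL assay volume are taken from the text, whereas the 1 cm optical path length and all function names are assumptions that are not stated in the original.

```python
EPSILON_NADH_340 = 6220.0   # M^-1 cm^-1 at 340 nm (from the text)
EPSILON_NAD_260 = 18000.0   # M^-1 cm^-1 at 260 nm (from the text)
ASSAY_VOLUME_ML = 0.5       # assay volume in mL (from the text)
PATH_LENGTH_CM = 1.0        # ASSUMPTION: standard 1 cm cuvette, not stated in the text


def units_from_slope(delta_a340_per_min: float,
                     volume_ml: float = ASSAY_VOLUME_ML,
                     path_cm: float = PATH_LENGTH_CM) -> float:
    """Convert an absorbance slope (A340 per min) into activity units,
    i.e. micromoles of reduced nucleotide formed per minute in the cuvette."""
    molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_cm)  # mol L^-1 min^-1
    return molar_per_min * (volume_ml / 1000.0) * 1e6                  # umol min^-1


def nad_concentration_mm(a260: float, path_cm: float = PATH_LENGTH_CM) -> float:
    """Return the NAD+ concentration in mM from its absorbance at 260 nm."""
    return a260 / (EPSILON_NAD_260 * path_cm) * 1000.0


if __name__ == "__main__":
    # Hypothetical readings, chosen only to show the unit conversions.
    print(f"activity: {units_from_slope(0.0622):.4f} units")
    print(f"NAD+ stock: {nad_concentration_mm(0.36):.3f} mM")
```

Dividing the resulting units by the milligrams of enzyme added to the cuvette would give the specific activity.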

Docking and surface electrostatic potential calculations
Aldehyde molecules were rigidly docked into the active site of the SoBADH and PsAMADH2 three-dimensional structures, so that the carbonyl oxygen makes the known interactions inside the oxyanion hole, using the PyMOL building mode, and were then energy-minimized using the GROMOS 96 force field potential (van Gunsteren et al., 1996) of the Swiss PDB Viewer software (Guex and Peitsch, 1997). The convergence criterion was a value of 0.05 kJ/mol for the averaged derivative. The trimethylammonium group of BAL was non-rigidly docked into SoBADH, after removing NAD+ and glycerol molecules from the crystallographic structure, using the PatchDock server (Schneidman-Duhovny et al., 2005).

Retrieval and phylogenetic analysis of ALDH10 orthologs
To obtain a phylogenetically wide sampling of ALDH10 orthologs from plants, we performed a blastp search on the NR collection of the NCBI protein database using the amino acid sequence of SoBADH as query. The maximum number of target sequences was set at 100 and, in order not to exclude any candidate plant ALDH10, the expected threshold was set at 10; the scoring matrix used was BLOSUM62, with gap opening and gap extension costs of 11 and 1, respectively. All sequences retrieved belong to either Viridiplantae or Eubacteria. In order to determine whether different hits coming from a given species correspond to different isoenzymes, we performed a
allows Trp-456 to be more distant from Tyr-160, thus leaving enough room between these two residues for a trimethylammonium group to bind. Images were generated using the UCSF Chimera package from the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, CA (Pettersen et al., 2004). D, Effects of mutation of residue Ala-441 to isoleucine or cysteine on the kinetic parameters of SoBADH with BAL as variable substrate.

Enzyme assays were carried out as described in the Experimental section, and the data analyzed as described in Fig. 2D. E, Saturation curves of wild-type SoBADH (black) and the mutants A441I (red) and A441C (green) with BAL, APAL, ABAL, and TMABAL as substrates. Enzyme assays were carried out and the data analyzed as described in Fig. 2D.