THE ‘OLD’ EUONYMUS EUROPAEUS AGGLUTININ REPRESENTS A NOVEL FAMILY OF UBIQUITOUS PLANT PROTEINS

Molecular cloning of the ‘old’ Euonymus europaeus agglutinin (EEA) demonstrated that the lectin is a homodimeric protein composed of 152-residue subunits. Analysis of the deduced sequence indicated that EEA is synthesized without a signal peptide and undergoes no post-translational processing apart from the removal of a 6-residue N-terminal peptide. Glycan array screening confirmed the previously reported high reactivity of EEA towards blood group B oligosaccharides, but also revealed binding to high mannose N-glycans, providing firm evidence for the occurrence of a plant carbohydrate-binding domain that can interact with structurally different glycans. BLAST searches indicated that EEA shares no detectable sequence similarity with any other lectin but is closely related evolutionarily to a domain that was first identified in some abscisic acid and salt stress-responsive rice proteins, and according to the available sequence data might be ubiquitous in Spermatophyta. Hence, EEA can be considered the prototype of a novel family of presumably cytoplasmic/nuclear proteins that are apparently ubiquitous in plants. Taking into account that some of these proteins are definitely stress-related, the present identification of the EEA lectin domain might be a first step in the recognition of the involvement and importance of protein-glycoconjugate interactions in some essential cellular processes in Embryophyta.


INTRODUCTION
Plant lectins have been studied for more than a century. Nevertheless, the inventory of all carbohydrate-binding domains occurring in plant cells is still incomplete. Until a few years ago, virtually all known plant lectins could be classified into seven families of structurally and evolutionarily related proteins (Van Damme et al., 1998). However, the identification of three novel sugar-binding domains/proteins during the last two years (Kaku et al., 2006; Peumans et al., 2007; Van Damme et al., 2007) leaves little doubt that more carbohydrate-binding domains remain to be discovered in plants. Two major problems hamper the discovery of the remaining sugar-binding motifs in plants. First, unless homologous lectins have been identified in other organisms, no relevant information is provided by genome/proteome analyses. Second, evidence is accumulating that the expression level of lectins with a specific endogenous role is so low that they escape detection by the currently available activity assays (Van Damme et al., 2004a,b).
Though at present virtually all abundant plant lectins can be classified into well-defined protein families, there are still a few exceptions for which sufficient sequence information is not available. One of these 'orphan' lectins is the Euonymus europaeus (spindle tree) agglutinin (EEA). As early as 1954, Schmidt et al. (1954) reported that the fleshy arils surrounding the seeds of the spindle tree contain a lectin with a clear preference for B-type erythrocytes within the human ABO system. The lectin was isolated for the first time in 1975 by conventional protein purification techniques (Pacak and Kocourek, 1975) and later by affinity chromatography on immobilized polyleucyl hog A+H blood group substance (Petryniak et al., 1977). Though the data shown in both papers indicated that the lectin consisted predominantly of partly disulfide-linked subunits of approximately 17 kDa, the molecular structure of the native agglutinin remained unclear. According to Pacak and Kocourek (1975), the lectin is a mixture of isoforms that have a similar molecular weight (Mr) (varying between 119 and 127 kDa) but differ in carbohydrate content (1.9-4.7%). Petryniak et al. (1977) also distinguished multiple molecular forms but reported a higher Mr (166 kDa) and higher carbohydrate content (approximately 10%) for the native lectin.
Later studies of both the E. europaeus lectin (Petryniak and Goldstein, 1987) and a lectin from the closely related species E. sieboldiana (Yamamoto and Sakai, 1981) yielded no additional information about the molecular structure of the native agglutinins. In contrast to the molecular structure, fairly detailed information was reported about the sugar-binding specificity of the E. europaeus lectin, which was found to be directed against the blood group B substance Galα1-3(Fucα1-2)Galβ1-4GlcNAc (Petryniak et al., 1977; Petryniak and Goldstein, 1987).
This report describes a detailed reinvestigation of the E. europaeus agglutinin (EEA) using a combination of biochemical, molecular and cellular-biological approaches. EEA represents a novel lectin family that shares no significant sequence similarity with any other known lectin family. Glycan array screening experiments confirmed that EEA recognizes the blood group B antigen, but also demonstrated that the lectin interacts with high mannose N-glycans. Interestingly, EEA shares high sequence identity with some previously identified salt stress/ABA-responsive proteins from rice (Oryza sativa) (Moons et al., 1995; 1997), which are apparently expressed in all terrestrial plants but in no other organisms.

Purification and biochemical characterization of EEA
Since EEA purification using a classical protocol for plant lectin isolation was hampered by the formation of insoluble complexes with endogenous glycoconjugates, the crude extract was first fractionated by ion exchange chromatography and gel filtration under conditions in which the carbohydrate-binding activity of EEA was reversibly inhibited. The resulting protein fraction was fully soluble in an aqueous buffer at neutral pH, and could readily be chromatographed on a column of immobilized ovomucoid to yield a pure water-soluble lectin preparation.
SDS-PAGE of the purified lectin in the presence of β-mercaptoethanol yielded a single polypeptide band of 17 kDa (Fig. 1). The lectin did not contain any covalently bound sugar.
Mass spectrometry of the lectin yielded a single peak with a molecular mass of 16,907 ± 2 Da. Edman degradation of the electroblotted 17 kDa polypeptide yielded a single sequence (ATGPTYRVYXRAAPNYNMTV, Suppl. Fig. S1).
Since gel filtration experiments yielded no conclusive results, the molecular mass of native EEA was estimated by dynamic light scattering, which revealed that the lectin was largely monodisperse. The scattering peak corresponded to particles with an average hydrodynamic diameter of 5.6 nm, consistent with globular protein assemblies of 37 kDa. Given a molecular weight of 16.9 kDa for the monomer, the dynamic light scattering data indicate that native EEA occurs as a dimer.
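The dimer assignment rests on the ratio between the mass estimated by dynamic light scattering and the mass of the monomer. A minimal sketch of that arithmetic in Python, using only the two values quoted above; the function name and the rule of rounding to the nearest whole number are illustrative assumptions, not part of the original analysis:

```python
def estimate_subunit_count(assembly_kda: float, monomer_kda: float):
    """Return the raw mass ratio and the nearest whole number of subunits.

    Rounding to the nearest integer is an illustrative assumption; the
    light scattering estimate itself assumes a globular particle.
    """
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    ratio = assembly_kda / monomer_kda
    return ratio, max(1, round(ratio))


# Values quoted in the text: 37 kDa assembly, 16.9 kDa monomer.
ratio, subunits = estimate_subunit_count(assembly_kda=37.0, monomer_kda=16.9)
print(f"mass ratio = {ratio:.2f}, nearest oligomeric state = {subunits}")
```

The ratio is slightly above 2, which fits a dimer better than any other whole number of subunits.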
Though our data confirm the size of the EEA subunits reported before, our lectin preparation did not contain any covalently bound sugars. Moreover, as is demonstrated below, EEA is synthesized on free ribosomes and hence cannot be N-glycosylated.
Therefore the relatively high carbohydrate content (2-10%) of the EEA preparations described in previous papers can hardly be ascribed to the lectin itself. Taking into account that EEA tends to form aggregates with endogenous glycoconjugates present in crude extracts, it is likely that the previously purified preparations consisted at least partly of lectin-glycoprotein complexes. The presence of such complexes not only accounts for the carbohydrate found in the lectin preparations described by Pacak and Kocourek (1975) and Petryniak et al. (1977) but also explains why these preparations sedimented with an apparent Mr of 119-127 kDa and 166 kDa, respectively, upon analytical centrifugation.

Molecular cloning of EEA
Screening of a cDNA library prepared from mRNA isolated from developing arilli yielded a cDNA clone with a deduced sequence that perfectly matched the N-terminal sequence of the EEA polypeptide. The cysteine, which is degraded during Edman degradation if it is not alkylated prior to the analysis, corresponded to the blank in the experimentally determined sequence. The cDNA clone comprised an open reading frame of 474 nucleotides corresponding to an EEA precursor sequence (LECEEA) of 158 amino acid residues that contains 6 extra residues preceding the N-terminus of the mature polypeptide (Suppl. Fig. S1). Calculation of the Mr of the polypeptide spanning residues A7-G158 yielded a value of 16,903.8 Da, which is in good agreement with the value obtained by mass spectrometry of the lectin (16,907 Da). This nearly perfect match in Mr, together with the occurrence of a 20 amino acid sequence identical to the N-terminus of the mature lectin polypeptide near the N-terminus of the deduced amino acid sequence of the cDNA, shows that the isolated cDNA clone encodes EEA.
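The consistency checks in this paragraph can be retraced with a few lines of arithmetic. The sketch below uses only numbers and sequences quoted in the text. The deduced N-terminus is written as the Edman sequence with the blank cycle read as cysteine, as stated above. The helper name and constants are illustrative.

```python
ORF_NUCLEOTIDES = 474         # open reading frame length of the cDNA
LEADER_RESIDUES = 6           # residues preceding the mature N-terminus
CALCULATED_MASS_DA = 16903.8  # calculated for residues A7-G158
MEASURED_MASS_DA = 16907.0    # mass spectrometry value (reported +/- 2 Da)

EDMAN_SEQUENCE = "ATGPTYRVYXRAAPNYNMTV"      # X marks the blank cycle
DEDUCED_N_TERMINUS = "ATGPTYRVYCRAAPNYNMTV"  # blank read as Cys, per the text


def matches_with_blanks(observed: str, deduced: str, blank: str = "X") -> bool:
    """True if every non-blank observed residue equals the deduced residue."""
    if len(observed) != len(deduced):
        return False
    return all(o == blank or o == d for o, d in zip(observed, deduced))


# 474 nucleotides should divide into whole codons.
precursor_residues, leftover_nucleotides = divmod(ORF_NUCLEOTIDES, 3)
mature_residues = precursor_residues - LEADER_RESIDUES
mass_difference_da = MEASURED_MASS_DA - CALCULATED_MASS_DA

print(f"precursor: {precursor_residues} residues, mature: {mature_residues} residues")
print(f"measured minus calculated mass: {mass_difference_da:.1f} Da")
print("Edman sequence consistent with deduced sequence:",
      matches_with_blanks(EDMAN_SEQUENCE, DEDUCED_N_TERMINUS))
```

The 474-nucleotide frame gives 158 codons, and removing the 6-residue leader leaves the 152-residue subunit. The calculated mass lies 3.2 Da below the measured value.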
No putative signal peptide could be identified in the deduced sequence indicating that the protein is synthesized on free ribosomes. After synthesis, the first 6 residues are apparently removed from the primary translation product. In silico analyses predict a cytoplasmic location of the Euonymus lectin.
To check for the presence of intron(s), a genomic sequence corresponding to the EEA gene was amplified and sequenced. Alignment of the genomic and cDNA sequence demonstrated that the lectin gene contains three introns (Suppl. Fig. S1).

EEA recognizes two classes of structurally different glycans
A reinvestigation of the carbohydrate-binding specificity of EEA using glycan array screening experiments confirmed its interaction with blood group B substance as previously described (Petryniak et al., 1977; Petryniak and Goldstein, 1987; Teneberg et al., 2003), but also revealed a previously unobserved interaction with N-linked, high mannose type glycans. The binding of fluorescent-labeled EEA to glycans on the microarray is shown in Supplementary Fig. S2. EEA also binds N-linked, high mannose type glycans (glycans 192, 193, 197 and 198), as shown in Table 1 for lectin concentrations of 10-200 µg/ml.
Since other linear oligomannosides on the array showed no binding, the binding of EEA towards N-linked glycans apparently requires the core pentasaccharide (Manα1,3(Manα1,6)Manβ1,4GlcNAcβ1,4GlcNAc). To assess the relative affinity of EEA for blood group B oligosaccharides and high mannose type N-glycans the binding assays were carried out at decreasing lectin concentrations to reveal the higher affinity structures.
At 50 µg/ml and 10 µg/ml EEA, the blood group B structures showed the highest affinity for EEA, whereas the fluorescence values for the high mannose N-glycans were roughly 10-fold lower (Table 1). These data indicate that EEA has a much higher affinity for blood group B oligosaccharides than for high mannose N-glycans.
Although the results of the glycan array screening experiments are only semi-quantitative, they indicate that EEA binds two structurally unrelated glycans. To determine if the lectin possibly possesses two independent binding sites with different specificities, the glycan array screening experiment was repeated in the presence of inhibitory oligosaccharides.
The inhibition data are graphically presented in Supplemental Fig. S3 and summarized in

EEA shares high sequence similarity with a domain found in some abscisic acid and salt stress-responsive rice proteins
Even though EEA cannot be classified into any of the currently known lectin families, it definitely shares a high sequence similarity with several other (hypothetical) plant proteins.
BLASTp searches with the deduced complete sequence of LECEEA revealed that the rice protein Osr40g3 scored best (Expect value = 1e-28), sharing 46% and 62% sequence identity and similarity, respectively, with EEA within a 151-residue overlap (Fig. 2A). Osr40g3 was identified as an abscisic acid and salt stress-responsive protein (Moons et al., 1997). The protein is encoded by the rice gene Os07g0684000 (NCBI annotation)/Os07g48500 (TIGR annotation). Four additional genes were identified in the rice genome, three of which encode proteins comprising two tandemly arrayed domains equivalent to Osr40g3 (Fig. 2B).
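Percent identity over an overlap is the number of identical columns divided by the alignment length. The following sketch shows that calculation on a short, purely hypothetical pair of aligned strings (not the EEA/Osr40g3 alignment). It then converts the reported percentages back into approximate column counts for the 151-residue overlap. Similarity is not computed here because it depends on the scoring matrix used by the search program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical toy alignment, for illustration only.
toy_a = "ATGPTYRV-Y"
toy_b = "ATGATYRVKY"
toy_identity = percent_identity(toy_a, toy_b)
print(f"toy identity: {toy_identity:.1f}%")

# Reported values: 46% identity and 62% similarity over 151 residues.
# The percentages are rounded, so the counts below are approximate.
OVERLAP = 151
approx_identical = round(0.46 * OVERLAP)
approx_similar = round(0.62 * OVERLAP)
print(f"about {approx_identical} identical and {approx_similar} similar columns")
```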
Interestingly, these rice proteins are annotated as 'Ricin B-related lectin domain containing protein'. This annotation is based on the presence in their sequence of two 'QXW' repeats, which are considered typical motifs of the ricin-B domain. However, it is questionable whether Osr40g3 can be classified in the ricin-B family because, according to BLASTp searches, it shares no significant overall sequence similarity with any protein comprising a ricin-B domain. Moreover, alignment of the amino acid sequences of Osr40g3 and, for example, the B-chain of the Ricinus communis agglutinin (AAA33869.1) yields a very low sequence identity/similarity (Suppl. Fig. S4).
Besides the 5 members of the rice OSR40 family, 20 other plant proteins were retrieved by the BLASTp searches (E-value < 0.1; Suppl. Table S2). One of these proteins is a wheat ortholog of Osr40g3. Another is a putative Osr40g3 homolog from Arabidopsis thaliana.
This Arabidopsis protein (At2g39050)   Similar BLAST searches in non-plant protein, genome and transcriptome databases did not yield a single positive hit indicating that the EEA domain is absent from other Eukaryota (e.g. animals and fungi) as well as from Prokaryota. Accordingly, one can reasonably conclude that the EEA domain is confined to plants.
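Selecting hits by an E-value threshold, as done for the searches described above, amounts to a simple filter. The sketch below assumes the search results were exported in the 12-column tabular layout of BLAST+ (`-outfmt 6`), in which the subject identifier is the second column and the E-value the eleventh. The hit names and all values in the sample text are hypothetical placeholders, not data from this study.

```python
import csv
import io

EVALUE_CUTOFF = 0.1  # threshold quoted in the text

# Hypothetical tabular results (columns: query, subject, % identity, length,
# mismatches, gap opens, q.start, q.end, s.start, s.end, E-value, bit score).
TOY_HITS = """\
LECEEA\thit_A\t46.0\t151\t80\t2\t1\t151\t5\t154\t1e-28\t120
LECEEA\thit_B\t30.0\t140\t95\t3\t5\t144\t10\t150\t0.05\t35.0
LECEEA\thit_C\t25.0\t60\t44\t1\t20\t79\t3\t62\t2.3\t22.0
"""


def significant_hits(tabular_text: str, cutoff: float = EVALUE_CUTOFF):
    """Return (subject, E-value) pairs below the cutoff, best hit first."""
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    hits = []
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject, evalue = row[1], float(row[10])
        if evalue < cutoff:
            hits.append((subject, evalue))
    return sorted(hits, key=lambda hit: hit[1])


selected = significant_hits(TOY_HITS)
for subject, evalue in selected:
    print(f"{subject}\t{evalue:g}")
```

With the toy input, the first two hits pass the threshold and the third is discarded.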

CONCLUSIONS
A reinvestigation of the E. europaeus agglutinin indicated that the previously reported molecular structure has to be revised. In addition, glycan array screening revealed that EEA interacts with two structurally unrelated glycans, namely the blood group B oligosaccharide and high mannose N-glycans. Molecular cloning demonstrated that EEA cannot be classified into any of the currently known (plant) lectin families but shares a high sequence similarity with a domain found in some previously identified abscisic acid and salt stress-responsive rice proteins. Although no similar lectins have been isolated yet, searches in the databases leave no doubt that all Spermatophyta express one or more proteins comprising either a single or two in tandem domains equivalent to the EEA subunit. We therefore propose that EEA represents a novel family of proteins that are apparently ubiquitous in Spermatophyta. Moreover, since no homologous genes/proteins are present in other eukaryotes or in prokaryotes, the EEA lectin family can be considered plant-specific. At present, the physiological role of the EEA family remains unclear. It has been proposed that the rice OSR40 protein family plays a role in the adaptive response of roots to a hyperosmotic environment and most probably has structural functions (Moons et al., 1995; 1997).
The latter assumption was based primarily on the presence at the N-terminus of some OSR40 proteins of a histidine-rich sequence that was believed to mediate protein-protein interactions. Evidently, the finding that the OSR40 proteins contain one or two EEA domains sheds new light on their function in the plant cells because they might be

Purification of the Euonymus europaeus agglutinin (EEA)
EEA was purified from arillus tissue using a combination of conventional protein purification techniques and affinity chromatography. Seeds were collected from local protocols (Pacak and Kocourek, 1975; Petryniak et al., 1977), no precipitation occurred during dialysis, indicating that the ion exchange and gel filtration chromatography steps effectively removed some interfering compounds from the crude extract. Final purification of the lectin was achieved by affinity chromatography. The lectin fraction was mixed with an equal volume of 2 M ammonium sulfate and applied to a column of ovomucoid-Sepharose 4B (2.6 cm × 10 cm; 50 ml bed volume) equilibrated with 1 M ammonium sulfate. After loading, the column was washed with 1 M ammonium sulfate until the A280 fell below 0.01, and the bound lectin was desorbed with 20 mM Tris-HCl (pH 10). The resulting affinity-purified lectin was dialysed against an appropriate buffer and used immediately or stored at -20°C until use. Following this procedure, approximately 50 mg of pure EEA was obtained from 100 g of dry arillus material, with an overall recovery of roughly 75%.
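The figures quoted for the affinity step and the final yield can be cross-checked with simple arithmetic. The sketch below uses only the numbers given above. It makes one assumption not stated in the text: that the 75% recovery refers to the lectin present in the starting extract, which is what allows the initial lectin content to be estimated.

```python
import math

COLUMN_DIAMETER_CM = 2.6
COLUMN_HEIGHT_CM = 10.0
PURE_LECTIN_MG = 50.0      # approximate yield of pure EEA
DRY_ARILLUS_G = 100.0      # starting dry arillus material
OVERALL_RECOVERY = 0.75    # "roughly 75%"


def cylinder_volume_ml(diameter_cm: float, height_cm: float) -> float:
    """Geometric volume of a cylindrical column bed (1 cm^3 equals 1 ml)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


bed_volume_ml = cylinder_volume_ml(COLUMN_DIAMETER_CM, COLUMN_HEIGHT_CM)
yield_mg_per_g = PURE_LECTIN_MG / DRY_ARILLUS_G
# Assumption: recovery is expressed relative to lectin in the crude extract.
implied_initial_mg = PURE_LECTIN_MG / OVERALL_RECOVERY

print(f"geometric bed volume: {bed_volume_ml:.1f} ml")
print(f"yield: {yield_mg_per_g:.2f} mg lectin per g dry arillus")
print(f"implied lectin in starting material: {implied_initial_mg:.1f} mg")
```

The geometric bed volume comes out close to the nominal 50 ml stated for the column.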

Analytical methods
The purified lectin was analyzed by SDS-PAGE in a 4-12% (w/v) Bis-Tris acrylamide gel (Invitrogen, Carlsbad, CA) and visualized by staining with Coomassie brilliant blue.
Glycoproteins were distinguished after SDS-PAGE and electroblotting using periodic acid Schiff's staining following the instructions of Sigma-Aldrich. Alternatively, total neutral sugar was determined by the phenol/H2SO4 method with D-glucose as standard (Dubois et al., 1956).
For N-terminal amino acid sequencing, the EEA polypeptides were separated by SDS-PAGE and electroblotted on a polyvinylidene difluoride membrane. Polypeptides were excised from the blots and sequenced on a model Procise 491cLC protein sequencer without alkylation of cysteines (Applied Biosystems, Foster City, CA, USA).
Dynamic Light Scattering (DLS) measurements were carried out using a Zetasizer Nano S (Malvern Instruments, UK) equipped with a 633 nm He-Ne laser and a temperature-controlled measuring chamber. Purified EEA at 0.45 mg/mL in distilled water was clarified by centrifugation for two hours at 16,000 g and the supernatant was then subjected to dynamic light scattering measurements at 20 °C.

Glycan array screening
The microarrays were printed as described before (Blixt et al., 2004) and version 3.0 (see https://www.functionalglycomics.org/static/consortium/resources/resourcecoreh8.shtml) was used for the analyses reported here. Lyophilized lectin preparations were dissolved in PBS at 1 mg/ml and labeled with tetrafluorophenyl (TFP)-Alexa Fluor 488 using the Invitrogen protein labeling kit following the manufacturer's instructions. Assuming an extinction coefficient of 1.85 for a 1.0 mg/ml solution, the molar ratios of Alexa488 to protein were 0.3 or 0.7 in two separate labelings. containing N-linked, high mannose oligosaccharides (Man5-8GlcNAc2) obtained by pronase (Calbiochem, San Diego, USA) digestion of RNase B and affinity purification of the glycopeptides on a column of Con A as previously described (Lang et al., 1984).

RNA isolation and construction of a cDNA library
Total RNA was prepared from the arils of Euonymus europaeus as described by Van Damme and Peumans (1993). The plant material was ground to a fine powder in liquid nitrogen using a pre-chilled mortar and pestle, and extracted in 20 ml/g fresh weight cold homogenization buffer (100 mM Tris-HCl pH 9.0, 5 mM EDTA, 100 mM NaCl, 1% β-mercaptoethanol). After centrifugation, SDS was added to a final concentration of 0.5%.
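The volumes implied by this step follow from the stated ratios. The sketch below works them out for a given amount of tissue. The 10% SDS stock concentration and the 1 g example are hypothetical, since the text does not specify them. The calculation also assumes the extract volume after centrifugation equals the buffer volume added.

```python
BUFFER_ML_PER_G = 20.0        # homogenization buffer per g fresh weight
SDS_FINAL_PERCENT = 0.5       # final SDS concentration from the text
SDS_STOCK_PERCENT = 10.0      # hypothetical stock; not given in the text


def stock_volume_for_final(extract_ml: float, stock_percent: float,
                           final_percent: float) -> float:
    """Volume of stock to add so the mixture reaches the final concentration.

    Solves final = stock * v / (extract + v) for v, so the added volume
    itself is accounted for in the final mixture.
    """
    if stock_percent <= final_percent:
        raise ValueError("stock must be more concentrated than the target")
    return final_percent * extract_ml / (stock_percent - final_percent)


tissue_g = 1.0  # hypothetical example amount
buffer_ml = BUFFER_ML_PER_G * tissue_g
sds_ml = stock_volume_for_final(buffer_ml, SDS_STOCK_PERCENT, SDS_FINAL_PERCENT)

print(f"buffer for {tissue_g:g} g tissue: {buffer_ml:.0f} ml")
print(f"10% SDS stock to add: {sds_ml:.2f} ml")
```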

Screening of cDNA library
Clones were screened by colony hybridisation using a 32P-end-labeled synthetic oligonucleotide probe derived from the N-terminal amino acid sequence of the mature EEA polypeptide. In subsequent screenings a cDNA clone encoding EEA was used as a probe, as described previously (Van Damme et al., 1996). The radioactive signal was visualized using the FujiFilm Fluorescent Image Analyzer FLA-5100 (FUJI, Dusseldorf, Germany).
Colonies that yielded positive signals were selected and rescreened at low density under the same conditions. Plasmids were isolated from purified single colonies on a miniprep scale using the QIAprep Spin MiniPrep kit (Qiagen, Venlo, The Netherlands) and sequenced at the VIB Genetic Service Facility (Antwerp, Belgium). DNA of a positive colony was purified and its sequence analysed.