Copyright © 1998 American Society of Plant Physiologists
Abstract
The root hair is a specialized cell type involved in water and nutrient uptake in plants. In legumes the root hair is also the primary site of recognition and infection by symbiotic nitrogen-fixing Rhizobium bacteria. We have studied the root hairs of Medicago truncatula, which is emerging as an increasingly important model legume for studies of symbiotic nodulation. However, only 27 genes from M. truncatula were represented in GenBank/EMBL as of October 1997. We report here the construction of a root-hair-enriched cDNA library and single-pass sequencing of randomly selected clones. Expressed sequence tags (899 total, 603 of which have homology to known genes) were generated and made available on the Internet. We believe that the database and the associated DNA materials will provide a useful resource to the community of scientists studying the biology of roots, root tips, root hairs, and nodulation.
Roots provide structural and physiological support for plant interactions with the soil environment. Epidermal root hairs are specialized cells with a high surface-to-volume ratio that enables them to play an important role in the transport of water, ions, and nutrients. Root hairs are outgrowths of trichoblasts, and elongate by tip growth, a distinct mode of plant cell growth shared only by pollen tubes (Peterson and Farquhar, 1996). The patterning, differentiation, and growth of root hairs have been elucidated by genetic and cell biological studies (Di Cristina et al., 1996; Galway et al., 1997; Sanchez-Fernandez et al., 1997; Schneider et al., 1997), but many aspects of root-hair cell function remain unknown. In the Fabaceae, root hairs are also significant because they respond actively to Rhizobium and related bacteria during the early stages of the symbiosis that leads to nitrogen fixation (Brewin, 1991; Hirsch, 1992). The identification of root proteins important for transport, cell growth and differentiation, and interaction with microbes is therefore of interest for studies in numerous plant species. However, because root hairs are small and often transient structures, direct biochemical analysis is challenging.
Medicago truncatula has emerged as an important experimental plant species both for studying nodulation by Rhizobium and for investigating mycorrhizal associations. M. truncatula is nodulated by R. meliloti, a bacterial species well characterized with respect to genetics and biochemistry. M. truncatula is autogamous and has a relatively small diploid genome and a number of genetically distinguishable ecotypes. These properties contribute to its suitability for molecular genetic analyses (Barker et al., 1990; Blondon et al., 1994; Cook et al., 1997). Genetic screens for plants with altered symbiotic phenotypes and various molecular approaches have identified several interesting mutations and genes (Benaben et al., 1995; Cook et al., 1995; Sagan et al., 1995; Gamas et al., 1996; Harrison, 1996; Peng et al., 1996; Burleigh and Harrison, 1997; Penmetsa and Cook, 1997).
Single-pass sequencing of cDNAs randomly picked from a library of genes made from a tissue of interest offers a complementary approach to biochemical and genetic analysis (Adams et al., 1991). ESTs generated by such an effort are compared with databases of identified genes. The results of these comparisons are used as a guide to assign putative identifications to the cDNAs. Researchers may then search or browse through the putative identifications of the ESTs to determine which genes may be of interest for further study. An investigator can also compare an experimentally obtained partial protein or nucleic acid sequence with the EST database to find out whether the cDNA for the gene has already been cloned. These methods have led to the rapid identification of genes in a number of organisms and have accelerated research by providing genetic material for further investigation (Adams et al., 1991, 1993; McCombie et al., 1992; Okubo et al., 1992; Newman et al., 1994; Rounsley et al., 1996).
In this paper we describe the collection of ESTs from a root-hair-enriched root-tip cDNA library from M. truncatula. We have constructed a website for access to the resulting database of 899 sequences. The sequence information and clones resulting from this effort may provide a useful tool for researchers studying general root physiology and molecular biology, and should have particular applications to the molecular biology of the R. meliloti-M. truncatula symbiosis.
MATERIALS AND METHODS
Tissue Collection
Sprouted seeds of Medicago truncatula cv Jemalong (Purkiss Seeds, Armidale, Australia) were grown overnight on 25 × 25-cm agar plates of Nod3 medium (2 mM CaSO4, 1 mM MgSO4, 0.5 mM K2HPO4, 1 mM 2-[N-morpholino]ethane sulfonic acid, pH 6.5, Murashige and Skoog minor salts without KI, and 11.5 g/L purified agar [Sigma] plus 1 μM Ag2SO4). Growth in enclosed plates was necessary to maintain sterility. However, under such conditions M. truncatula seedlings are sensitive to accumulated ethylene, an inhibitor of nodulation. Adding Ag2SO4, an inhibitor of plant responses to ethylene, to the growth medium blocked the effect of ethylene on nodulation (P.A. Covitz, unpublished results), and presumably reduced the likelihood that nodulation-related genes were improperly expressed due to ethylene.
Seedlings were placed on a sterile, wet paper towel with roots dangling 2 to 3 cm over one long edge, and the towel was rolled. The exposed root tips were dipped in liquid nitrogen and broken off. In one preparation these root tips with intact root hairs were used directly for RNA isolation. In another preparation root hairs were isolated by stirring in liquid nitrogen on a magnetic stir plate for 15 to 30 min until they broke off. Stripped root tips fell to the bottom of the container, whereas detached root hairs floated and were collected by decanting the liquid nitrogen (Rohm and Werner, 1987).
RNA Isolation
Tissue was ground under liquid nitrogen using a mortar and pestle and transferred to a centrifuge tube containing 8 mL of hot (70°C) borate extraction buffer (0.2 M sodium borate, 1% SDS, and 30 mM EGTA) per gram of tissue. An equal volume of Tris-EDTA-saturated phenol:chloroform (1:1, v/v), pH 9.0, was added. The mixture was vortexed for 1 min and put on ice for 5 min before homogenization with a Polytron (model Kinematica PT 1200, Brinkmann Instruments, Westbury, NY). The sample was centrifuged and the aqueous layer was re-extracted three times with phenol:chloroform (1:1, v/v) and once with chloroform only. RNA was isolated by differential precipitation in 2 M LiCl followed by reprecipitation with ethanol.
cDNA Library Construction
Total RNA from root hairs (287 μg) and intact root tips with root hairs (663 μg) was pooled and sent to Stratagene for poly(A+) RNA selection and construction of a root-hair-enriched root tip cDNA library. First-strand cDNA synthesis used an oligo-dT linker-primer with an XhoI cloning site. The 5′ end of each cDNA was ligated to an adaptor with an EcoRI-compatible overhang. cDNA was ligated unidirectionally into the EcoRI and XhoI sites of the λ-ZAP Express vector (Stratagene), packaged in vitro, and amplified. The amplified library represents approximately 10^6 recombinants.
Sequencing
The phage library was converted to the plasmid form by mass excision according to the procedure described by Stratagene. Amplified library lambda phage were co-infected with M13-derived ExAssist helper phage into Escherichia coli strain XL1-Blue MRF′, and the bacteria were grown for 2.5 h. The culture supernatant containing the single-stranded phagemid form of the library was used to infect E. coli strain XLOLR. The bacteria were grown for 75 min and then used directly for double-stranded plasmid DNA preparation. Plasmid library DNA was electroporated into E. coli strain XL1-Blue, and the bacteria were plated at low density on medium containing Luria-Bertani broth, tetracycline (10 mg L−1), and kanamycin (25 mg L−1) after an outgrowth time of 40 min. Individual colonies were selected randomly for plasmid DNA purification and sequencing.
Template purification and sequencing of the plasmid cDNAs were performed either by the authors or by commercial DNA sequencing service providers (e.g. Bio-101 [Vista, CA] and Lark Technologies [Houston, TX]). All sequencing reactions contained the standard T3 sequencing primer, and thus read into the presumed 5′ end of each cDNA. Reactions were run and analyzed on either capillary or slab-gel electrophoresis automated sequencing machines (Perkin-Elmer). These machines generate two computer files for each sequencing run: a chromatogram file and a plain text file.
Sequence Editing
All stages of data analysis and assembly were performed on Macintosh operating system-based computers. The sequence text files were edited to remove leading vector and trailing, poor-quality sequence using the Java-based computer program SeqTrim, which was written for this work (P.A. Covitz, unpublished program). SeqTrim also flagged anomalous clone sequences that were then edited manually after examination of their corresponding chromatogram files.
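The trimming step described above can be illustrated with a minimal sketch. This is not the SeqTrim program, which was written in Java and is unpublished; the function name, the vector_tail argument, and the window and threshold values are all assumptions chosen only for illustration. The sketch removes leading sequence up to a known vector/adaptor motif, cuts the read where ambiguous base calls (N) become frequent, and flags reads in which the expected junction is missing so that they can be inspected manually.

```python
def trim_read(seq, vector_tail, max_n_fraction=0.1, window=20):
    """Trim one single-pass read (illustrative sketch, not SeqTrim).

    seq            -- raw base calls (A, C, G, T, N) as a string
    vector_tail    -- hypothetical vector/adaptor motif expected
                      immediately upstream of the cDNA insert
    max_n_fraction -- assumed threshold for "poor quality"
    window         -- assumed sliding-window length in bases

    Returns (trimmed_sequence, flagged), where flagged is True when the
    vector/insert junction was not found and the read needs manual review.
    """
    seq = seq.upper()
    vector_tail = vector_tail.upper()

    # Leading vector: keep only what follows the junction motif.
    junction = seq.find(vector_tail)
    flagged = junction == -1
    insert = seq if flagged else seq[junction + len(vector_tail):]

    # Trailing poor-quality sequence: find the first window in which the
    # fraction of N calls exceeds the threshold, then cut at its first N.
    end = len(insert)
    for i in range(max(len(insert) - window + 1, 0)):
        chunk = insert[i:i + window]
        if chunk.count("N") / window > max_n_fraction:
            end = i + chunk.index("N")
            break

    return insert[:end].rstrip("N"), flagged
```

A read that lacks the junction motif is returned untrimmed at its 5′ end with flagged set to True, which mirrors the manual examination of anomalous clones against their chromatogram files.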
Homology Comparisons and Database Construction
Edited EST sequences were entered into FileMaker Pro (Claris, Santa Clara, CA), a relational database. Each EST was translated in all six reading frames and compared with the nonredundant database at the National Center for Biotechnology Information (NCBI) using the BLASTX program. Default BLAST parameter values were used except for the following settings: Expect = 1, Alignments = 3, and Descriptions = 10. Sequences that returned no significant homology were again compared using BLASTN with Expect = 0.1, Alignments = 3, and Descriptions = 10. The results of the comparisons were incorporated into the FileMaker database. Homologies to negative reading frames were disregarded, except in clones with inserts in the reverse orientation. Putative identifications for the ESTs were assigned based on the results of the BLAST searches and in some cases with information contained in related abstracts in MEDLINE. WebStar (Quarterdeck) and Tango for FileMaker (Everyware) are being used to display the FileMaker database to users on the World Wide Web.
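The six-frame translation that underlies a BLASTX comparison can be sketched as follows. This is only an illustration of the concept using the standard genetic code; BLASTX performs the translation internally, and the function names here are assumptions rather than part of any tool used in this work.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from the first base; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return a dict mapping frame (+1..+3, -1..-3) to its translation."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames
```

Because the cDNAs were cloned unidirectionally and sequenced from the 5′ side, matches in frames +1 to +3 are the expected ones; a match in a negative frame is meaningful only for a clone whose insert is in the reverse orientation.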
RESULTS
Single-Pass Sequencing of Random cDNAs
Root-hair cells are the site of infection by R. meliloti, yet they represent only a small proportion of the total mass of tissue in the root. We postulated that genes uniquely or preferentially expressed in root hairs may be critical for symbiotic recognition. We therefore constructed a cDNA library from root tissue enriched for root hairs to increase the proportionate representation of root-hair-specific genes. The RNA source material for the library was derived primarily from the infection zone of the root and contained growing root hairs, fully differentiated root hairs, root tips including the meristem, and root-cap cells (see Methods). Thus, all of the major cell types and processes related to the early stages of symbiosis were represented.
We constructed the library by unidirectional insertion of oligo-dT-primed cDNAs into λ-ZAP-Express. The phage library was converted through mass excision to a plasmid library in the vector pBK-CMV. Individual clones from the plasmid library were picked at random and sequenced using a standard T3 primer that was annealed and extended into the 5′ side of the inserted cDNA. Of 958 sequencing reactions attempted, 919 produced some length of readable sequence.
The text file resulting from each sequencing reaction was edited using both automated and, if necessary, manual methods. SeqTrim was used to remove leading vector and trailing, poor-quality sequences from each file. In cases in which the sequencing reaction extended past the poly(A+) tail of the gene, the sequence was trimmed to remove the 3′ vector and linker sequences. SeqTrim also flagged sequences derived from anomalous clones. These anomalies generally fell into three classes: (a) clones with altered adaptor sequence or incorrect base calls at the junction between the vector and the cDNA insert; (b) clones with inserts in the reverse orientation; or (c) clones without inserts altogether. The files for these three classes were edited manually; sequences from the third class were removed from further consideration.
Putative Identification of Genes
Each EST was compared against all sequences in the nonredundant database at the NCBI using the program BLASTX, which compares translated nucleotide sequences with protein sequences. Sequences that had no homology to any protein in the database were then reanalyzed using the program BLASTN, which compares nucleotide sequences with nucleotide sequences. The results of each comparison were screened manually. Sequences deemed to be of bacterial origin were removed from the collection. After screening and editing we retained 899 ESTs. Statistical information about the collection is shown in Table I.
Table I. EST collection statistics
Of the 899 ESTs, 603 had significant homology to previously identified genes. Although the BLAST scores and P values were considered, the assessment of whether a given homology was significant was determined by investigator judgment, not by absolute numerical cutoffs. The annotations of genes with similarities to an EST were used to assign a putative identification to that EST. In cases in which the annotation was vague, information contained in MEDLINE abstracts related to the gene was used to assign a putative identification. The 603 ESTs with similarity to known genes represent 356 distinct genes, as indicated by their putative identifications. These distinct identifications were grouped into 13 functional categories, which are listed in Table II.
Table II. EST putative identifications
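The proportions implied by the counts reported in the text can be worked out with a short calculation. The sketch below uses only numbers stated in this paper (958 reactions attempted, 919 readable, 899 ESTs retained, 603 with homology, 356 distinct genes); the variable names and the helper function are illustrative.

```python
# Counts taken directly from the text.
attempted_reactions = 958
readable_reads = 919
retained_ests = 899
ests_with_homology = 603
distinct_genes = 356

ests_without_homology = retained_ests - ests_with_homology


def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


read_success = percent(readable_reads, attempted_reactions)
homology_rate = percent(ests_with_homology, retained_ests)
no_homology_rate = percent(ests_without_homology, retained_ests)
# Average number of ESTs per distinct putative identification.
ests_per_gene = round(ests_with_homology / distinct_genes, 2)

print(f"Readable reads:          {read_success}% of attempted reactions")
print(f"ESTs with homology:      {homology_rate}% of retained ESTs")
print(f"ESTs without homology:   {no_homology_rate}% ({ests_without_homology} ESTs)")
print(f"ESTs per distinct gene:  {ests_per_gene}")
```

About two-thirds of the retained ESTs therefore matched a known gene, and each distinct identification was sampled fewer than twice on average, which is consistent with a library that has not yet been sequenced to saturation.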
Abundantly Expressed Genes
The relative abundance of the mRNA in a tissue is approximately reflected in the abundance of its corresponding cDNA in non-normalized libraries. Random sequencing of cDNAs therefore yields information about the relative expression levels of the genes represented by the ESTs (Adams et al., 1993). Table III lists the genes from the five most abundant mRNAs in our library. Two of these, Met synthase and elongation factor 1-α, are critical for protein metabolism, whereas β-glucosidase is involved in cell wall development. The abundant expression of these genes reflects the actively growing state of the source tissue used to generate the library. The remaining two most abundantly expressed genes are members of the membrane intrinsic protein water channel family, a result that is consistent with the physiological role of the root in water uptake.
Table III. Most abundant mRNAs
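The abundance reasoning above amounts to counting how many ESTs share the same putative identification. A minimal sketch follows; the function name is an assumption, and the example list in the usage note is invented for illustration and does not reproduce the counts in Table III.

```python
from collections import Counter


def rank_by_abundance(identifications, top=5):
    """Rank putative identifications by the number of ESTs assigned to each.

    identifications -- one putative identification string per EST
    Returns a list of (identification, est_count, fraction_of_all_ests).
    """
    counts = Counter(identifications)
    total = len(identifications)
    return [(name, n, n / total) for name, n in counts.most_common(top)]
```

For a hypothetical input such as ["Met synthase", "Met synthase", "β-glucosidase"], the function ranks Met synthase first with two of three ESTs. In a non-normalized library these fractions are only an approximate guide to relative mRNA levels, and small counts carry large sampling error.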
Met synthase is also related to ethylene metabolism, so the abundance of the gene is intriguing because of the possibly complex role(s) of ethylene in root-hair development and symbiosis (Masucci and Schiefelbein, 1996; Dolan, 1997; Heidstra et al., 1997; Penmetsa and Cook, 1997). However, the presence of the ethylene-response inhibitor Ag2SO4 in the plant growth medium (see Methods) may have perturbed the expression of enzymes involved in ethylene metabolism. This possibility was not specifically tested in M. truncatula roots.
DISCUSSION
Genes of Interest
Our RNA preparation included actively growing and differentiating cells. We therefore expected a diverse representation of sequences from cytoskeleton, vesicle trafficking, and cell division functions. Representatives of these classes were found, along with a number of sequences that encode likely cell wall proteins and cell wall synthesis enzymes. An even larger number of gene products with homology to proteins involved in signal transduction were identified. Considering the dynamic growth of root hairs and the induction of root cortical cell division in response to R. meliloti cells and Nod factors, the proteins encoded by sequences in each of these categories might be especially interesting targets for cell biological manipulation and analysis.
Genes categorized under defense and secondary metabolism functions should also be appropriate targets of study given the possible involvement, or suppression, of host defenses during infection by Rhizobium (Hirsch and Fang, 1994; Mellor and Collinge, 1995; Spaink, 1995). In particular, an endochitinase homolog may be intriguing to examine (EST 00194; serial nos. can be used to obtain detailed EST information at http://bio-SRL8.stanford.edu, as described below). This endochitinase EST does not have significant homology to a previously identified M. truncatula chitinase cDNA derived from roots infected with R. meliloti (accession no. Y10373; A. Niebel, unpublished data). The identification of two different chitinases in uninfected and infected roots permits more rigorous testing of the hypothesis that chitinases have a regulatory role in plant responses to lipooligosaccharide nodulation factors (Staehelin et al., 1994).
Several other sequences are especially interesting given their homology to genes with known functions in other systems. One intriguing sequence in the collection is a homolog of the mammalian eyes absent (eya) gene (EST 00777). The BLAST comparison of the EST with homology to this gene yielded high scoring pairs spanning 103 amino acids of human eya2. The P value of the match was 4.9e−14, suggesting that the alignment was not due to chance. The eya genes have been implicated in eye development in both mammals and insects, a finding that has challenged the long-held notion that eyes evolved independently in these two branches of animal phylogeny (Duncan et al., 1997; Zimmerman et al., 1997). The presence of an eya homolog in plants suggests that the history of this gene family may extend to a common ancestor of plants and animals.
Another sequence of particular significance is a His kinase (EST 00711) with the strongest similarity to a bacterial two-component regulator (P < 1.1e−26). Only a few His kinase sequences have been reported in plants, but these are of high interest because one, the ETR1 gene, has been identified by genetics as an ethylene receptor, and its product has been shown by direct biochemical studies to be an ethylene-binding protein (Chang et al., 1993; Schaller and Bleecker, 1995). Another sequence with His kinase homology is implicated by genetic tests as a possible cytokinin receptor (Kakimoto, 1996). Thus, the functionally characterized His kinase-like sequences reported in plants are possible receptors for chemical ligands. The appearance of a new His kinase in the root-hair-enriched library is important in light of the internal and external chemical signals that are operating in roots and their root hairs. The M. truncatula sequence appears to be distinct from ETR1 and other functionally characterized genes. We are pursuing functional characterization of this newly identified His kinase homolog through construction of antisense and other transgenic plants.
We expected to find expressed sequences representing major root functions such as transport and cell growth and division; however, since no cells from other portions of the plant were present, we did not expect to find functions typically associated with shoot, leaf, or reproductive organs. Therefore, some EST sequences in the library are developmentally surprising. These include chlorophyll a- and b-binding protein (ESTs 00122, 00414, and 00460), an embryo-specific protein (EST 00339), and a pollen-specific protein (EST 00010).
Several genes with homology to sugar transporters were found. No homologs of the nitrate transporter or of other putative plasmalemma transporters were identified. The apparent paucity of transporter genes is somewhat puzzling. However, we note that more than one-fifth of the ESTs in the database show no significant homology to known genes. Transporter genes may be represented among these unclassified sequences. It is also possible that the putative identifications of the membrane transporters in Table II reflect their basic membrane transport functions, but do not necessarily identify the correct substrate.
Sequence similarity as indicated by BLAST does not always reflect actual conservation at the relevant functional sites of the proteins in question. Examination of individual sequence lineups and reference to all original papers should precede conclusions about the likely function of any particular expressed gene. The Web site described below is designed to facilitate this process.
Internet Access to Detailed EST Information
The entire collection of ESTs has been organized into an online database that is accessible via the World Wide Web at http://bio-SRL8.stanford.edu. This Web site provides tools for browsing and searching the database. Each EST has a detailed record that includes the results and date of its BLAST comparison and links to additional information on the matching genes at the NCBI. This provides a way to examine the relationship of the putative homolog to the gene being queried. The raw sequence chromatogram files and chromatogram-viewing software are available for downloading at this site as well. The raw data should prove useful to researchers who want to confirm the base calls of a particular sequence, for example, to design primers. In addition, all of the EST sequences have been deposited in dbEST at the NCBI. An investigator who wishes to compare his or her own sequence with these M. truncatula ESTs can do so by performing a BLAST search with the NCBI server by selecting dbEST as the database to search against (http://www.ncbi.nlm.nih.gov/BLAST).
Uses for the ESTs
The EST data from the M. truncatula cDNA library described here can initially be used to create codon-usage tables and other data tables to assist in the establishment of M. truncatula as a model system for molecular genetic studies. The sequences can be used to generate probes to isolate genomic DNA containing the corresponding genes and to provide markers for physical maps. Gene-expression studies may identify genes with cell-type-specific or symbiotically regulated expression patterns. Once isolated from genomic DNA, the promoters of such genes may provide valuable reagents for transgenic promoter-fusion experiments. Other genes described here may be useful as controls for constitutive expression.
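A codon-usage table of the kind mentioned above can be derived by counting in-frame codons. The sketch below is a simplified illustration under a strong assumption: each input sequence has already been trimmed to coding sequence in the correct reading frame (for example, guided by a BLASTX alignment). Raw single-pass ESTs may contain untranslated regions and base-calling errors, so they would need that preparation first. The function name is hypothetical.

```python
from collections import Counter


def codon_usage(coding_sequences):
    """Return the relative frequency of each codon.

    coding_sequences -- iterable of DNA strings, each assumed to start
                        at the first base of a codon (frame +1)
    Codons containing ambiguous bases such as N are skipped.
    """
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if set(codon) <= set("ACGT"):
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}
```

Frequencies are reported as fractions of all counted codons; multiplying by 1000 gives the per-thousand convention often used in published codon-usage tables.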
Root hairs are mechanically and optically accessible and their growth can be actively studied using a number of cell biological techniques and experimental manipulations (Dazzo et al., 1996; Ehrhardt et al., 1996; Galway et al., 1997). EST sequences that encode proteins of known or predicted function may be used to create peptide antigens for generating antibodies to be used in such studies. The protein products of cytoskeletal protein and cell wall enzyme genes may be good candidates for this approach.
Finally, the EST database may be of use to scientists who have biochemically purified proteins of interest from M. truncatula. The partial peptide sequence of a purified protein could be compared against translated EST sequences. If present, related and more extensive cDNAs could then be readily identified and used as tools for additional studies.
ACKNOWLEDGMENTS
We are grateful to Audrey Southwick for assistance in the preparation of plants and isolation of RNA for the construction of the library and to Melanie Ukanwa for assisting with the BLAST homology searches. We thank Michael Cherry and David Flanders for their advice on constructing the EST database, JoAnne Connelly for assistance with the manuscript, and members of our laboratory for numerous useful suggestions.
Footnotes
1 S.R.L. is an investigator of the Howard Hughes Medical Institute. Additional support for this project came from the Department of Energy, Energy Biosciences Program (contract no. DE-FG03-90ER20010). P.A.C. was a fellow of the Jane Coffin Childs Memorial Fund for Medical Research.
2 Present address: Incyte Pharmaceuticals, 3174 Porter Drive, Palo Alto, CA 94304.
* Corresponding author; e-mail fa.srl{at}forsythe.stanford.edu; fax 1–650–725–8309.
Abbreviations: EST, expressed sequence tag.
Received December 30, 1997.
Accepted May 4, 1998.