First published online December 15, 2006; 10.1104/pp.106.087270 Plant Physiology 143:579-586 (2007) © 2007 American Society of Plant Biologists OPEN ACCESS ARTICLE
The Rice Kinase Database. A Phylogenomic Database for the Rice Kinome1,[OA]
United States Department of Agriculture-Agricultural Research Service, Appalachian Fruit Research Station, Kearneysville, West Virginia, 25430 (C.D.); Department of Plant Pathology, University of California, Davis, California, 95616 (J.C., T.R., P.R.); and Institute for Genomic Research, Rockville, Maryland, 20853 (S.O.)
The rice (Oryza sativa) genome contains 1,429 protein kinases, the vast majority of which have unknown functions. We created a phylogenomic database (http://rkd.ucdavis.edu) to facilitate functional analysis of this large gene family. Sequence and genomic data, including gene expression data and protein-protein interaction maps, can be displayed for each selected kinase in the context of a phylogenetic tree allowing for comparative analysis both within and between large kinase subfamilies. Interaction maps are easily accessed through links and displayed using Cytoscape, an open source software platform. Chromosomal distribution of all rice kinases can also be explored via an interactive interface.
The presence of large gene families in plant and animal genomes, and the varying levels of functional redundancy associated with such families, create a considerable challenge to the functional analysis of individual genes. For example, knockouts of a single gene within a gene family often produce little or no observable phenotype. Newer technologies such as RNAi provide enhanced capability to study gene families, as RNAi can be used to knock down multiple genes simultaneously. However, this technology has practical limitations on the number of genes that can be simultaneously silenced and still requires rational selection of gene targets. In the absence of phenotypic information, functional information can be inferred from comparative genomic or systems biological studies that incorporate bioinformatic, genomic, gene expression, and proteomic data. These approaches are hampered by current database formats that typically permit displays of only one gene or one field at a time and are therefore not amenable to simultaneous comparisons of multiple data sets and/or multigene families. The scattered nature of genomic data across multiple databases creates additional challenges to data integration.
A new field of study that is at least in part resolving these limitations is phylogenomics. Phylogenomics represents a merger between phylogenetics and genomics and puts genomic data in a phylogenetic context (K. Solander, personal communication). Phylogenetic trees provide a platform to sort and categorize genes into groups based on sequence similarity and are particularly valuable when studying large gene families. Consequently, phylogenetic trees provide a useful foundation for functional predictions based on limited phenotypic data. They also provide a context to identify members within gene families that have unique properties such as the presence of novel domains, functional motifs, or expression patterns.
Thus, phylogenomic analyses can provide a more logical basis for rational selection of gene candidates for further detailed functional studies. One family of genes for which redundancy poses enormous challenges is protein kinases. Kinases comprise a highly conserved family of enzymes that control diverse cellular processes and are key components of virtually all biological systems. The high degree of similarity found between even diverse protein kinases and the ability to generate robust phylogenetic groupings makes this gene family an excellent candidate for phylogenomic studies.
Kinases are found both as cytoplasmic proteins and as domains within larger membrane-bound receptors. Sequencing of the rice (Oryza sativa) genome has enabled the characterization of the entire complement of rice kinases or kinome. Remarkably, the rice kinome contains 40% more kinases than Arabidopsis (Arabidopsis thaliana) and is 3 times larger than the human kinome (Shiu and Bleecker, 2001
Protein kinases have been classified into seven major phylogenetic groups (Manning et al., 2002 receptor kinases, and Raf kinases (TKL), and homologs of yeast (Saccharomyces cerevisiae) sterile 7, sterile 11, and sterile 20 kinases (STE). Like Arabidopsis, the rice kinome lacks obvious members of the Tyr kinases group. Seventy-five percent of all rice kinases (1,068) fall into the TKL group that includes the large interleukin-1 receptor-associated kinase (IRAK) family and includes both receptor and cytoplasmic kinases. Within the rice IRAK family, 69 subfamilies have been delineated based on phylogenetic analyses and organization of extracellular domains (Shiu et al., 2004
Gene expression and proteomic data for rice genes and proteins are growing at an exponential rate. The release of two new microarray platforms, Affymetrix (http://www.affymetrix.com/products/arrays/specific/rice.affx) and the National Science Foundation (NSF)-funded 45 K array (http://www.ricearray.org), should further accelerate data deposition. Likewise, data from massively parallel signature sequencing (MPSS), a powerful method to accurately determine the representation of transcripts within mRNA or regulatory small RNA populations, are also becoming increasingly available. Rice MPSS data are growing rapidly and are currently available for multiple tissues as well as abiotic and biotic stress treatments (http://mpss.udel.edu/rice/; Nakano et al., 2006).
Here we report the creation of a publicly available rice kinase phylogenomic database. We describe how the database was constructed and give guidelines for its use, along with results from an initial global kinase expression analysis.
Kinase Sequences and Phylogram
Representative kinases from six kinase groups (STE, TKL, CMGC, AGC, CK1, and CAMK) were used to perform reiterative TBLASTN searches against three databases: National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), The Institute for Genomic Research (TIGR; release 2; http://www.tigr.org/tdb/e2k1/osa1/), and Knowledge-based Oryza Molecular biological Encyclopedia full-length cDNA database (http://cdna01.dna.affrc.go.jp/cDNA/; Kikuchi et al., 2003
Kinase domains identified by the KinG server (http://hodgkin.mbu.iisc.ernet.in/
Kinase Sequence Data
Sequence information, including annotation, chromosome number, position of 5' and 3' ends, and source bacterial artificial chromosome clone, was provided by TIGR. TIGR also provided information related to sequence and annotation quality, including sequencing status, availability of cDNA or expressed sequence tags, genes with matches to known transposable elements, and whether or not annotation of each sequence has been verified by Program to Assemble Spliced Alignments (http://www.tigr.org/software/genefinding.shtml). Kinase genomic, cDNA, and protein sequences in FASTA format are also listed and available for download.
Conserved kinase motifs, including kinase subdomains I (G-loop), II, III, VI, VII, and VIII, were extracted from global multiple alignments of all rice kinase domains. These motifs contain highly conserved residues required for catalytic activity and can serve as a predictor of kinase function. Kinase topology, including predicted transmembrane domains and N-myristoylation sites, as well as predicted functional domains, were provided by Mike Gribskov and are also available in PlantsP (http://plantsp.genomics.purdue.edu/; Tchieu et al., 2003
Kinase chromosomal distributions were determined and visualized using GenomePixelizer (Kozik et al., 2002
We searched all available transposon and T-DNA flanking sequences in the National Center for Biotechnology Information for matches to rice kinases using BLASTN. These consist primarily of deposits from two insertional sequencing projects (Miyao et al., 2003
Expression data for each rice kinase was extracted from the rice MPSS database (http://mpss.udel.edu/rice/; Nakano et al., 2006
Potential Arabidopsis homologs of all rice kinases were identified using BLASTP (http://www.ncbi.nlm.nih.gov/). Arabidopsis identification (ID) numbers for the highest-scoring hit and the associated BLAST E value are indicated.
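The best-hit selection described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used to build the RKD: it assumes the BLASTP results are available as 12-column tab-delimited text (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E value, bit score), and the function name `best_hits` and the gene IDs in the usage example are hypothetical.

```python
import csv
import io


def best_hits(tabular_text):
    """Return {query_id: (subject_id, evalue)} for the highest-scoring hit per query.

    Assumes 12-column tab-delimited BLAST output with the E value in
    column 11 and the bit score in column 12 (an assumption about the
    input format, not something stated in the article).
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        # Keep the hit with the largest bit score seen so far for this query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return {q: (s, e) for q, (s, e, _) in best.items()}


# Hypothetical rice and Arabidopsis IDs, for illustration only.
sample = (
    "riceK1\tathA\t55.0\t300\t120\t5\t1\t300\t1\t298\t1e-50\t180.0\n"
    "riceK1\tathB\t70.0\t310\t90\t2\t1\t310\t1\t309\t1e-90\t320.0\n"
    "riceK2\tathC\t40.0\t250\t140\t8\t5\t255\t3\t250\t1e-20\t95.5\n"
)
hits = best_hits(sample)
```

Here `hits["riceK1"]` holds the subject with the larger bit score (`athB`) together with its E value, mirroring the ID and E value pair shown for each kinase.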
The kinase interaction maps presented on RKD combine data obtained from an NSF-funded high-throughput yeast two-hybrid (Y2H) and tandem affinity purification (TAP) project. A total of 275 rice kinases were used as baits in the Y2H system to identify interacting proteins (X. Ding, unpublished data). Concurrently, the same 275 rice kinases were TAP-tagged and transformed into rice. Stable transgenic rice plants expressing the TAP-tagged kinases were used to isolate in vivo copurifying proteins. The identities of copurifying proteins for the first 45 TAP-tagged kinases were determined using mass spectrometry (Rohila et al., 2006
Because Y2H is an indirect screen in a heterologous system, additional evidence is needed to validate the biological relevance of these putative kinase interactors. Protein interactions identified using other methods (e.g. in vitro/in vivo coimmunoprecipitation or TAP) or corroborative evidence from other biological systems add support for physiological relevance. For example, we found that CK II
Links to the Tree Viewer, Interactome, and Chromosome Distribution maps are provided on the home page. On the Tree Viewer page, genomic and functional genomic fields can be selected by checking the corresponding boxes (Fig. 3). Pressing submit displays the selected data adjacent to the tree. To enable cross-species comparisons, the Arabidopsis kinase most similar to each rice kinase can also be displayed by selecting nearest Arabidopsis kinase. The spreadsheet format allows all data or user-defined subsets of data to be readily transferred into a spreadsheet or database program, such as Excel, for further analysis. Clicking on the gene ID link brings up a summary window showing all of the available data for that kinase, including an interactive Cytoscape protein-protein interaction map (Fig. 4). Links to the TIGR rice database and PlantsP can also be displayed, providing simple navigation between all data display formats as well as complementary databases.
Kinases used in the interaction mapping study can be indicated on the phylogenetic tree by selecting the kinase interaction map icon, which provides a link to the interaction map displayed in the summary window (Fig. 4). Alternatively, the complete interaction maps can be viewed by clicking on the Interactome icon listed on the home page. The Interactome page displays the complete Cytoscape interaction map for all kinases. The Y2H and TAP tagging studies are currently listed separately but will be integrated upon completion of the study.
Chromosome maps are color coded according to kinase subfamily, and kinases are represented as colored squares (Fig. 5). A color key is provided. Mousing over a square generates a pop-up showing the ID of that kinase. Clicking on the square returns the user to the Tree Viewer page with the selected kinase highlighted in red.
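As an illustration of how separately listed Y2H and TAP interaction sets could be combined for display in Cytoscape, the following minimal sketch writes both edge lists into Cytoscape's simple interaction format (SIF), in which each line gives a source node, an interaction type, and a target node. It is not the RKD implementation: the function name `to_sif`, the interaction-type labels `y2h` and `tap`, and the protein names in the example are assumptions made for this sketch.

```python
def to_sif(y2h_pairs, tap_pairs):
    """Merge two lists of (bait, prey) pairs into tab-delimited SIF text.

    Each output line is "bait<TAB>label<TAB>prey", where the label records
    which assay reported the interaction. The labels "y2h" and "tap" are
    arbitrary names chosen for this sketch.
    """
    edges = set()  # a set removes duplicate reports within one assay
    for label, pairs in (("y2h", y2h_pairs), ("tap", tap_pairs)):
        for bait, prey in pairs:
            edges.add((bait, label, prey))
    # Sort for a stable, reproducible file.
    return "\n".join("\t".join(edge) for edge in sorted(edges))


# Hypothetical bait and prey names, for illustration only.
sif_text = to_sif(
    y2h_pairs=[("kinA", "protB")],
    tap_pairs=[("kinA", "protB"), ("kinA", "protC")],
)
```

An interaction reported by both assays appears as two edges with different labels, so evidence from the two methods remains distinguishable after merging.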
Phylogenomic Analysis of Kinase Expression
A handful of plant receptor kinases are known to function as specific pathogen recognition receptors sometimes called disease resistance genes (for review, see Nürnberger and Kemmerling, 2006
To test this hypothesis and, at the same time, the utility of the RKD, we performed a global phylogenomic analysis of kinase expression using all available rice MPSS data (Nakano et al., 2006). For each kinase, NBS-LRR, or cytochrome P450 gene, the median expression level across all tissues was calculated. The values were normalized and expressed as transcripts per million (TPM). Next, the median values for all kinases within each kinase group, family, or subfamily were averaged. The median values for NBS-LRRs and for P450s were likewise averaged within each gene class. The averages and SDs were plotted to assess the overall expression levels of each kinase clade (at the group, family, and subfamily levels; Fig. 6).
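The two-step summary described above (a median across tissues for each gene, then an average and SD of those medians within each clade) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the analysis code behind Figure 6: the function name `summarize_by_clade`, the gene and clade names, and the TPM values are hypothetical, and the use of the sample SD (reported as 0.0 for single-gene clades) is an assumption because the text does not say which SD was used.

```python
from statistics import mean, median, stdev


def summarize_by_clade(expression, clade_of):
    """Return {clade: (mean_of_gene_medians, sd_of_gene_medians)}.

    expression: {gene_id: [TPM value per tissue or library]}
    clade_of:   {gene_id: clade name (group, family, or subfamily)}
    """
    medians_by_clade = {}
    for gene, tpm_values in expression.items():
        # Step 1: one median expression level per gene across all tissues.
        medians_by_clade.setdefault(clade_of[gene], []).append(median(tpm_values))
    summary = {}
    for clade, medians in medians_by_clade.items():
        # Step 2: average the gene medians within the clade; the sample SD
        # is an assumption and is undefined for a single gene, so use 0.0.
        sd = stdev(medians) if len(medians) > 1 else 0.0
        summary[clade] = (mean(medians), sd)
    return summary


# Hypothetical genes, clades, and TPM values, for illustration only.
expression = {
    "k1": [0.0, 10.0, 20.0],
    "k2": [30.0, 30.0, 90.0],
    "k3": [5.0, 5.0, 5.0],
}
clade_of = {"k1": "cladeA", "k2": "cladeA", "k3": "cladeB"}
summary = summarize_by_clade(expression, clade_of)
```

Taking the median per gene first limits the influence of a single tissue with very high expression before genes are pooled within a clade.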
On the whole, IRAK kinases are expressed at lower levels than other kinase groups with the exception of the IRAK RLCK-VIII subfamily that shows the highest median expression levels of all kinases. IRAK RLCK-VIII includes homologs of tomato (Lycopersicon esculentum) Pti1, a known phosphorylation target of the Pto disease resistance gene (Zhou et al., 1995
The RKD was created to provide a logical format to analyze diverse sets of genomic information in a phylogenetic context. User-selected genomic and functional genomic fields can be displayed on a phylogenetic tree with links to chromosomal and protein-protein interaction maps. Rather than analyzing kinases one by one, the RKD allows simultaneous visualization of entire kinase groups, families, and subfamilies. This format allowed us to identify features of rice receptor kinases that are specifically associated with pathogen recognition (Dardick and Ronald, 2006
We thank M. Gribskov (Purdue University, West Lafayette), who assisted with the identification of rice kinase sequences and supplied protein domain data. R. Buell (The Institute for Genomic Research, Rockville, MD) provided helpful insights. We also thank M. Fromm (University of Nebraska, Lincoln) and W. Song (University of Florida, Gainesville) for their contributions of data before publication.
Received July 24, 2006; accepted November 17, 2006; published December 15, 2006.
1 This work was supported by the National Institutes of Health (grant no. GM59962), by the National Science Foundation (grant no. 0217312), and by the National Institute of General Medical Sciences Division of Minority Opportunities in Research (Institutional Research and Academic Career Development Award to C.D. and grant no. K12GM00679). The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Pamela C. Ronald (pcronald{at}ucdavis.edu).
[OA] Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.106.087270
* Corresponding author; e-mail pcronald{at}ucdavis.edu; fax 530-752-5674.
Chen X, Shang J, Chen D, Lei C, Zou Y, Zhai W, Liu G, Xu J, Ling Z, Cao G, et al (2006) A B-lectin receptor kinase gene conferring rice blast resistance. Plant J 46: 794-804
Chern M, Fitzgerald HA, Canlas PE, Navarre DA, Ronald PC (2005) Overexpression of a rice NPR1 homolog leads to constitutive activation of defense response and hypersensitivity to light. Mol Plant Microbe Interact 18: 511-520
Dardick CD, Ronald PC (2006) Plant and animal pathogen recognition receptors signal through non-RD kinases. PLoS Pathog 2: e2
Gietz RD, Graham KC, Litchfield DW (1995) Interactions between the subunits of casein kinase II. J Biol Chem 270: 13017-13021
Kikuchi S, Satoh K, Nagata T, Kawagashira N, Doi K, Kishimoto N, Yazaki J, Ishikawa M, Yamada H, Ooka H, et al (2003) Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 301: 376-379
Kozik A, Kochetkova E, Michelmore RW (2002) GenomePixelizer: a visualization program for comparative genomics within and between species. Bioinformatics 18: 335-336
Krupa A, Abhinandan KR, Srinivasan N (2004) KinG: a database of protein kinases in genomes. Nucleic Acids Res 32: 153-155
Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S (2002) The protein kinase complement of the human genome. Science 298: 1912-1934
Meyers BC, Morgante M, Michelmore RW (2002) TIR-X and TIR-NBS proteins: two new families related to disease resistance TIR-NBS-LRR proteins encoded in Arabidopsis and other plant genomes. Plant J 32: 77-92
Mindrinos M, Katagiri F, Yu GL, Ausubel FM (1994) The A. thaliana disease resistance gene RPS2 encodes a protein containing a nucleotide-binding site and leucine-rich repeats. Cell 78: 1089-1099
Miyao A, Tanaka K, Murata K, Sawaki H, Takeda S, Abe K, Shinozuka Y, Onosato K, Hirochika H (2003) Target site specificity of the Tos17 retrotransposon shows a preference for insertion within genes and against insertion in retrotransposon-rich regions of the genome. Plant Cell 15: 1771-1780
Nakano M, Nobuta K, Vemaraju K, Tej SS, Skogen JW, Meyers BC (2006) Plant MPSS databases: signature-based transcriptional resources for analyses of mRNA and small RNA. Nucleic Acids Res 34: D731-D735
Nürnberger T, Kemmerling B (2006) Receptor protein kinases: pattern recognition receptors in plant immunity. Trends Plant Sci 11: 519-522
Oldroyd GED, Staskawicz BJ (1998) Genetically engineered broad-spectrum disease resistance in tomato. Proc Natl Acad Sci USA 95: 10300-10305
Rohila JS, Chen M, Chen S, Chen J, Cerny R, Dardick C, Canlas P, Xu X, Gribskov M, Kanrar S, et al (2006) Protein-protein interactions of tandem affinity purification-tagged protein kinases in rice. Plant J 46: 1-13
Sallaud C, Gay C, Larmande P, Bes M, Piffanelli P, Piegu B, Droc G, Regad F, Bourgeois E, Meynard D, et al (2004) High throughput T-DNA insertion mutagenesis in rice: a first step towards in silico reverse genetics. Plant J 39: 450-464
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Shiu S, Bleecker AB (2001) Receptor-like kinases from Arabidopsis form a monophyletic gene family related to animal receptor kinases. Proc Natl Acad Sci USA 98: 10763-10768
Shiu SH, Karlowski WM, Pan R, Tzeng YH, Mayer KF, Li WH (2004) Comparative analysis of the receptor-like kinase family in Arabidopsis and rice. Plant Cell 16: 1220-1234
Shpak ED, Berthiaume CT, Hill EJ, Torii KU (2004) Synergistic interaction of three ERECTA-family receptor-like kinases controls Arabidopsis organ growth and flower development by promoting cell proliferation. Development 131: 1491-1501
Song WY, Wang GL, Chen LL, Kim HS, Pi LY, Holsten T, Gardner J, Wang B, Zhai WX, Zhu LH, et al (1995) A receptor kinase-like protein encoded by the rice disease resistance gene, Xa21. Science 270: 1804-1806
Sun X, Cao Y, Yang Z, Xu C, Li X, Wang S, Zhang Q (2004) Xa26, a gene conferring resistance to Xanthomonas oryzae pv. oryzae in rice, encodes an LRR receptor kinase-like protein. Plant J 37: 517-527
Suzaki T, Sato M, Ashikari M, Miyoshi M, Nagato Y, Hirano HY (2004) The gene FLORAL ORGAN NUMBER1 regulates floral meristem size in rice and encodes a leucine-rich repeat receptor kinase orthologous to Arabidopsis CLAVATA1. Development 131: 5649-5657
Tchieu JH, Fana F, Fink JL, Harper J, Nair TM, Niedner RH, Smith DW, Steube K, Tam TM, Veretnik S, et al (2003) The PlantsP and PlantsT functional genomics databases. Nucleic Acids Res 31: 342-344
Zhou J, Loh YT, Bressan RA, Martin GB (1995) The tomato gene Pti1 encodes a serine/threonine kinase that is phosphorylated by Pto and is involved in the hypersensitive response. Cell 83: 925-935