First published online May 27, 2005; 10.1104/pp.105.060244
Plant Physiology 138:935-948 (2005)
© 2005 American Society of Plant Biologists
Computational Identification of 69 Retroposons in Arabidopsis1,[w]
National Center for Gene Research (Y.Z., Y.W., Y.L., B.H.), and Shanghai Institute of Plant Physiology and Ecology (Y.Z., B.H.), Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, China
Retroposition is a shotgun strategy by which the genome achieves evolutionary diversity, mixing and matching coding sequences with novel regulatory elements. We have identified 69 retroposons in the Arabidopsis (Arabidopsis thaliana) genome by a computational approach. Most of them are derivatives of mature mRNAs, and 20 genes contain relics of the reverse transcription process, such as truncations, deletions, and extra sequence additions. Of the 69, 22 are processed pseudogenes, and 52 genes are likely to be actively transcribed, especially in tissues derived from apical meristems (roots and flowers). The functional composition of the retroposon parental genes implies that it is not the mRNA itself but its expression in gamete cells that defines a suitable template for retroposition. The presence/absence patterns of retroposons can be used as cladistic markers for biogeographic research. Effects of humans and of the Mediterranean Pleistocene refugia on Arabidopsis biogeographic distributions were revealed based on two recent retroposons (At1g61410 and At5g52090). The evolutionary rate of new gene creation by retroposition was calculated as 0.6 genes per million years. Retroposons can also be used as molecular fossils of parental gene expression in ancient times. Extension of the 3' untranslated regions of expressed parental genes is revealed as a possible trend in plant transcriptome evolution. In addition, we report the first plant functional chimeric gene that adapted to intercompartmental transport by capturing two additional exons after retroposition.
The scientific community has witnessed an accelerated advance toward understanding genomes and their evolutionary dynamics in recent years. Gene numbers among these organisms vary greatly, indicating the existence of a general process of new gene creation, a central theme for evolutionary diversity and speciation. Several molecular mechanisms have been known for this process (Long et al., 2003
Retroposition describes a process by which an RNA species was reverse transcribed, and the resultant double strand cDNA was then incorporated back into the genome (Brosius, 1991
With the availability of whole genome sequences, extensive analysis has been done in human, mouse, Drosophila, and Caenorhabditis elegans, and many retroposons have been identified by experimental or computational methods in recent years (Gonçalves et al., 2000
Annotation and Functional Classification
All Arabidopsis genes were clustered according to protein sequence similarity. Within each cluster, the genes with fewer exons were selected as candidates and inspected for three retropositional hallmarks: (1) lack of introns in the coding region compared with their paralogs; (2) a remnant of the poly-(A) tail at the 3' end; and (3) target site duplications (TSDs; 5 bp < length < 15 bp) generated by retroposon integration. In this way, 59 retroposons from multiple exonic gene families as well as 4 genes (At3g11810, At4g03290, At5g49490, and At5g66830) from clusters of single exonic genes were identified. In addition, six single exonic singletons were confirmed by comparison with their rice (Oryza sativa) orthologs, which have more than four exons, giving a total of 69 retroposons in the current Arabidopsis genome (Columbia ecotype [Col-0]; Table I; the detailed sequence annotation information can be found in Supplemental File 1). Fifty-one parental genes were also determined (some parental genes gave rise to more than one retrogene) according to phylogenetic tree topology, conservation in untranslated regions (UTRs), or the synonymous substitution rate (Ks) of each retroposon with its paralogs, as described in "Materials and Methods." Figure 1 shows the gene structure of a recent retroposon (At1g63760) relative to its parental gene (At1g05890).
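The candidate-selection step described above can be illustrated with a minimal Python sketch. It assumes that "fewer exons" means fewer than the most exon-rich member of the same cluster, which is one possible reading of the text; the function name, the input dictionaries, and the gene labels are hypothetical and are not taken from the original pipeline.

```python
from collections import defaultdict


def select_candidates(exon_counts, cluster_of):
    """Return genes with fewer exons than the most exon-rich gene of their cluster.

    exon_counts: dict mapping gene ID -> number of exons (hypothetical input)
    cluster_of:  dict mapping gene ID -> protein-similarity cluster ID (hypothetical input)
    """
    # Group genes by cluster.
    clusters = defaultdict(list)
    for gene, cluster_id in cluster_of.items():
        clusters[cluster_id].append(gene)

    candidates = []
    for members in clusters.values():
        max_exons = max(exon_counts[g] for g in members)
        # Assumption: a gene is a candidate if some paralog has more exons.
        candidates.extend(g for g in members if exon_counts[g] < max_exons)
    return sorted(candidates)


# Illustrative data only: geneB is intronless while its paralog geneA has 8 exons.
exon_counts = {"geneA": 8, "geneB": 1, "geneC": 1, "geneD": 1}
cluster_of = {"geneA": 1, "geneB": 1, "geneC": 2, "geneD": 2}
print(select_candidates(exon_counts, cluster_of))
```

Clusters made up only of single exonic genes (geneC and geneD here) yield no candidates under this rule, which is consistent with the text treating such clusters, and single exonic singletons, as separate cases that need additional evidence.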
As evidence of the retropositional event, 16 retroposons have at least one conserved UTR longer than 50 bp. For example, At3g60610 has a 178-bp 3' UTR that shows 90.302% nucleotide identity with its parental gene, as well as a 61-bp conserved 5' UTR. Both At1g03300 and At1g45100 were identified as retroposons. Of the 69 retroposons, 67 are intronless, and only two genes (At1g63210 and At5g56720) kept one intron each that was not spliced out of the pre-mRNA. In most cases, the remnants of poly-(A) tails and TSDs could hardly be detected; only 15 retroposons have both a predicted poly-(A) tail and TSDs, while 27 retroposons (39%) have neither. Truncation is a striking feature of retroposons and results from the low processivity of reverse transcriptases, especially toward the 5' direction. Thirteen retroposons were truncated in coding regions in the 5' direction with clear retropositional endpoints (Fig. 2A). We also identified six 3' truncated retroposons (Fig. 2B). For At1g63210, truncations toward both the 5' and 3' directions were observed. Three of these 18 truncated genes kept TSDs, suggesting that the truncations might result from the RT process rather than from genomic rearrangements subsequent to retroposition.
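A search for TSDs of the stated size (5 bp < length < 15 bp) might be sketched as follows. This is an illustration only: it assumes exact direct repeats and simply reports the longest string shared by the two flanking windows, whereas a real screen would also need to tolerate mismatches and filter low-complexity sequence. The function name and the test sequences are hypothetical.

```python
def find_tsd(upstream, downstream, min_len=6, max_len=14):
    """Return the longest exact repeat shared by both flanks, or None.

    upstream / downstream: flanking sequences on either side of the insertion.
    min_len / max_len: inclusive bounds matching 5 bp < length < 15 bp.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    # Try the longest allowed repeat first, then shorter ones.
    for length in range(max_len, min_len - 1, -1):
        seen = {upstream[i:i + length]
                for i in range(len(upstream) - length + 1)}
        for j in range(len(downstream) - length + 1):
            kmer = downstream[j:j + length]
            if kmer in seen:
                return kmer
    return None


# Made-up flanks: a 9-bp repeat sits before the insertion and after a poly-(A) run.
up = "GGCCTTAAGTCATGCA"
down = "AAAAAAAATTAAGTCATCCG"
print(find_tsd(up, down))
```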
Internal sequence deletion was another structural defect observed. Seven internal exons of At4g38030 (from the fourth to part of the 10th, 1,130 bp in total) were completely deleted in its retroposed copy, At1g65210 (Fig. 2C). In addition, one case of extra sequence addition was observed in At3g58390, a retroposon derived from the Arabidopsis cell division gene pelota (PEL1, At4g27650): a 63-bp fragment in the coding sequence (CDS) of this retroposon was derived from mitochondrial ORF159 (AtMg01050; Fig. 2D). In total, 20 genes have structural traits of the RT process (e.g. truncations, deletions, and/or extra sequence additions). Of the 69 retroposons, 22 (32%) are processed pseudogenes that lost coding potential after integration; subsequent mutations accumulated in their ORFs and preclude their functionality. For example, an A-to-T mutation introduced a premature stop codon in At2g25500, while insertion of a 150-bp AG-rich (93%) fragment caused loss of function of At3g27720.
Most of the parental genes that gave rise to retroposons can be classified into four functional categories: 10 are involved in cell division, chromosome partition, or DNA repair; 15 are related to transcription or translation; 11 have DNA or protein binding activities; and the remaining 15 genes encode enzymes that are involved in miscellaneous cellular processes. The average length and GC content of the retroposed regions of these retroposons are 1,225 bp and 43.7%, respectively, close to the average values for the entire gene set of Arabidopsis (1,301 bp and 40.302%, respectively; Arabidopsis Genome Initiative, 2000
We identified potentially expressed retroposons through BLASTN search against expressed sequence tag (EST) databases that exhibited Expressional profiling of those retroposons and their parental genes were carried out in four tissues (root, shoot, leaf, and flower) by use of the specific primers. It turned out that most of the parental genes were actively transcribed (45 out of 51 parental genes; Table I). Interestingly, no expressions of three parental genes (At5g58340, At1g18310, and At1g69090) were detected, but all of their corresponding retroposons (At1g15720, At5g15870, and At5g66830) have evidence of expression. In addition, an obvious tendency was observed that most of the expressed retroposons were active in both roots and flowers, tissues from apical meristems. RT-PCR results of four representative retroposons were shown in Figure 3.
Gene and Genome Organization
To illustrate the relationship between these retroposons and their parental genes, we mapped all of them on the Arabidopsis genome (Fig. 4). The retrogenes appear to have been incorporated randomly into the chromosomes, and no hotspots for insertion were detected. None of them was present in the pericentromeric regions. A high density of retroposons on chromosome 5 (23 retroposons, 33%) and a relatively low density on chromosome 4 (only four genes) were also observed. Eleven retrogenes are located on the short arm of chromosome 5, which represents one of the highest density regions. Strikingly, few retroposons originated from chromosome 2, which harbors only five parental genes. An exodus scenario was also remarkable for chromosome 4 (especially the long arm), which has the highest parental gene density but the lowest retroposon density. No neighboring retroposons, such as those found among retroposons of the human high-mobility group, were identified (Strichman-Almashau et al., 2003
To analyze the genomic context of integration sites, we calculated base compositions of the 250-bp flanking sequences on both sides of each retroposon. The average GC contents of the 5' and 3' flanking sequences were 37% and 33%, respectively, close to the overall GC content of the genome (34.7%; Arabidopsis Genome Initiative, 2000). Even though a majority of these retroposons were in intergenic spaces, some were located in proximity to genes in a tail-to-tail configuration. Seventeen retrogenes had been incorporated into the 3' UTRs of their neighboring genes in the opposite orientation within a 500-bp distance, resulting in overlapping transcription in complementary directions. For example, At1g15720 has a 79-bp 3' UTR that is complementary to that of the downstream gene (At1g15730), which lies in the opposite orientation. For some of the retroposons, portions of the 3' coding regions can only be transcribed as 3' UTRs of their downstream genes in the reverse direction. No expression of At5g28210 was detected, but the transcribed 3' UTR of its downstream neighboring gene (At5g28220) in the reverse orientation (GenBank accession no. BT004260) was found to have a 104-bp overlap with this retroposon.
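The flanking-sequence calculation can be sketched in a few lines of Python. The sketch assumes 0-based, half-open coordinates on the plus strand and ignores ambiguous bases such as N; for a gene on the minus strand the two returned values would correspond to the 3' and 5' flanks in swapped order. The function names and the toy sequence are illustrative, not the authors' code.

```python
def gc_content(seq):
    """Return GC content in percent, counting only unambiguous A/C/G/T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def flank_gc(genome, start, end, flank=250):
    """GC content of the regions flanking genome[start:end] (0-based, half-open).

    Returns (upstream_gc, downstream_gc) relative to the plus strand.
    Flanks are shortened automatically at chromosome ends.
    """
    upstream = genome[max(0, start - flank):start]
    downstream = genome[end:end + flank]
    return gc_content(upstream), gc_content(downstream)


# Toy example with a 4-bp flank instead of the 250 bp used in the text.
toy = "GGCC" + "AAAA" + "ATGC"
print(flank_gc(toy, 4, 8, flank=4))
```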
Arabidopsis has been characterized as the most extensively duplicated and reorganized genome that has three rounds of genome-wide duplications (Bowers et al., 2003
Elucidation of the emergence pattern of retroposons will provide insight into the evolutionary dynamics of Arabidopsis. The calculated Ks of these retroposons ranged evenly from 0.037 to 1.764, implying a continuous retropositional process through evolution. Of these retrogenes, some have intronless counterparts in the corresponding genome segmental duplication blocks (Fig. 4), suggesting that the formation of them is earlier than the latest genome duplication event of about 14 to 83 million years ago (MYA; Bowers et al., 2003
We also identified 13 retrogenes that have intronless counterparts in rice genome. There are two possible origins for these genes: they have inserted in the common ancestor before differentiation of the two species, or they have evolved independently during the evolution of the two species (for their high transcription levels that are prone to RTs). To discriminate the two possibilities, we searched against genomes of another monocot, maize (Zea mays; Palmer et al., 2003
Of these retroposons, eight showed higher than 90% DNA sequence similarity with their founder genes (among which five retrogenes have Ks values less than 0.1), suggesting very recent retropositional events. Such genes may be absent from other Arabidopsis ecotypes, in which case only the target site sequence would remain at the corresponding locus. We screened 16 additional Arabidopsis ecotypes by PCR. Four retroposons (At1g63760, At3g11810, At3g58390, and At3g60610) were present in all of these ecotypes, and four ecotypes (Ba-1, Br-0, Ler-0, and Mt-0) contained all of the recent retroposons found in Col-0, while the other 12 ecotypes lacked some of the remaining four retrogenes (Table II). Surprisingly, we could not detect the band corresponding to At5g23600, an expressed tRNA 2' phosphotransferase pseudogene, in as many as 12 of the ecotypes checked. Subsequent cloning and sequencing of fragments encompassing the retroposed region revealed polymorphisms at the 3' end, where the reverse PCR primer happened to be located; in Col-0 as well as Ba-1, Br-0, Ler-0, and Mt-0, the retroposed region extended 52 bp downstream of the stop codon of the parental gene, but in the other 12 ecotypes, it ended 15 bp before the stop codon and had lost a 369-bp downstream fragment (Fig. 5B), implying different retroposon integration performances across ecotypes. With 5.2 MYA as an upper time limit of divergence between different ecotypes of Arabidopsis (Koch et al., 2000
At5g52090, the most recent retroposon, however, is an exception to the general retropositional process. It is not directly reverse transcribed by RTs, but coreverse transcribed with a helper sequence: a 7-bp microhomologous fragment between mRNA of At5g37150 (parent of At5g52090) and this helper sequence provides an anchor site for co-RT, then this sequence is integrated into the chromosome. It has been revealed that the helper sequence was a Katydid-At1 type terminal-repeat retrotransposons in miniature (TRIM) element (Witte et al., 2001
Even with the availability of the Arabidopsis genome sequence, our knowledge of the plant retropositional process is still scant because of the poor annotation of pseudogenes and the difficulty of detecting highly divergent sequences. In this article, we report the identification of 69 retroposons in the Arabidopsis genome by a computational approach with stringent parameters. The total number of retroposons in the genome is therefore probably underestimated. However, the goal of this work was not to be exhaustive but to minimize the false positive rate and to set up a blueprint of plant retroposons; we excluded retroposons supported only by paralogs with fewer than three exons, which may result from an adjacent-exon-merge process, as well as ambiguous clusters of single exonic genes with no obvious poly-(A) relics. In fact, most of the 69 genes are likely to be derivatives of mature mRNAs, and only two genes kept one intron each that was not spliced out of the pre-mRNA. Records of the RT process such as truncations, deletions, and extra sequence additions can be found in 20 of these genes, supporting the authenticity of this identification. Characterization of retroposons provides a unique perspective for understanding genome and gene dynamics over millions of years of evolution. Four genes (At1g14680, At4g02630, At5g17630, and At5g63370) that were once assigned as segmental duplication products of multiple exonic paralogs were shown here to be retroposons. In addition, retroposon identification helped to improve sequence annotation: the original annotations of 14 retroposons and five parental genes were corrected by correlating them with retropositions.
In this work, an evolutionary rate of new gene creation by retroposition (0.6 retroposons per million years) in Arabidopsis was reached based on the presence/absence patterns of four recent retrogenes in different ecotypes. Assuming the evolutionary distance between Arabidopsis and rice as 170 to 235 MYA (Yang et al., 1999
Retroposons were in fact jumping genes like retrotransposons (Roos et al., 2004
Characterization of retroposons can be regarded as a genomic archaeological process; what can be excavated depends on what has been successfully deposited in the genome through millions of years of evolution. A retroposon can no longer be discerned unless structural hallmarks and reference genes (intron-containing paralogs or the founder genes themselves) are identified in the genome. However, of the 69 retroposons identified, only 38 had predicted poly-(A) tails, and 27 had neither recognizable poly-(A) tails nor TSDs. This posed the biggest obstacle to identifying authentic retrogenes. Generally, a nascent poly-(A) tail will tend to decrease in length with time due to slippage during DNA replication, and long homopolymeric runs of As will also lead to genetic instability (Symers et al., 2002
With rice genome as a reference, six single exonic singletons were characterized as retroposons by comparing with those rice orthologs more than four exons, suggesting the existence of orphan retroposons ("
However, not all intronless genes were retroposons. For example, the Arabidopsis genome has two nuclear genes that encoded the translation elongation factor EF-Tu (Fig. 5D): one multiple exonic isoform (At4g02930, 12 exons) encoded the mitochondrial precursor, and one intronless isoform (At4g20360) encoded the chloroplast precursor. Even though the overall DNA sequence similarity between them was 59%, the chloroplast precursor was not a retropositional copy of the multiple exonic gene. Molecular phylogenetic evidence has revealed that At4g20360 was in fact a nuclear transfer product of the chloroplast tufA gene in the green algal ancestor of land plants (Baldauf and Palmer, 1990
Logically, a gene that can generate processed copy must meet three prerequisites: (1) be highly expressed, for a higher probability to be transported into the nucleus and reverse transcribed; (2) can be expressed, at least in some stage, in the germ line cells (or apical meristem cells that finally differentiate into gamete cells in plants), so as to be fixed in the genome and successfully transmitted into the next generation; and (3) recruits an active promoter to avoid transcriptional silencing, being maintained through millions of years of evolution. This stipulates that retroposon parental genes should be widely expressed, highly conserved house-keeping genes as have been shown in this report. In fact, 45 out of 51 parental genes are highly expressed in examined tissues. These functional components reflect the nature of the retropositional process: not the sequence per se of the RNA molecule, but its expression level in gamete cells determines the probability of an RNA to be reverse transcribed. Virtually all types of mRNA are capable of retroposition (Brosius, 1999
Since retroposons are derivatives of mature mRNAs that emerged in various evolutionary periods, they can serve as molecular fossils that reflect the expression of their parental genes at the time they were formed, and sequence comparisons can help to identify variant splicing forms that are expressed only under specific conditions or are extinct (Strichman-Almashau et al., 2003). For At1g45100, sequence alignment indicated the existence of a 24-bp mini-exon in the parental gene. The identification of At2g45530, a putative C3HC4-type RING-finger family protein, revealed a 63-bp exon in its rice ortholog that is missed by all gene prediction programs. This exon was confirmed by RT-PCR in rice as well as by sugarcane EST data (GenBank accession no. CA202140). Comparisons between At1g15720 and At5g58340 suggested different gene structures toward the 3' direction (Fig. 6A). Since only the retroposed copy is actively transcribed, we changed the parental gene prediction according to its fossil record of expression.
Such comparisons can also help to identify three parental genes that have been disabled after producing retroposed copies. For example, the splicing acceptor site of the second intron of At5g37150 was wrongly predicted because of one T insertion in the third exon (Fig. 6B), while the last intron of At1g18310 was indeed a defective exon as confirmed by retroposon sequence as well as rape paralog data (GenBank accession no. AAP42646). In addition, alignments for two retroposons with their respective parental genes seemed to imply the existence of an aberrant splicing signal as TC-TC (Fig. 6C). We even noticed an extensive exon reshuffling in At1g13350 toward the 5' direction after retroposition. Such comparisons also revealed changes in 3' UTRs through evolution. Of the 18 retroposons that showed higher than 80% sequence similarities in retroposed regions with their respective expressed founder genes, there are seven genes whose retropositional 3' end points matched well with the parental genes' transcriptional stop sites, suggesting the transcriptional patterns of those parental genes have not been changed after retroposon creations. But for the remaining nine genes (except for two genes that were truncated toward 3'), extensions of the 3' UTRs of the parental genes were striking, from 90 to 300 bp according to EST data. These changes may imply possible alternative poly-(A) signals, but efforts to clone these parental mRNAs similar to retroposons failed. When all retroposons were checked, 28 extensions on 3' UTRs out of 51 founder genes were observed, suggesting that extension of 3' UTR, which plays a crucial role in posttranscriptional regulation of gene expression by modulating nucleocytoplasmic mRNA transport, translation efficiency, subcellular localization, or mRNA stability, may be a common mechanism of plant transcriptome evolution.
Retroposition represents a reverse flow of genetic information via RNA intermediates. It is a shotgun approach by which the genome achieves functional innovation, and thus evolutionary diversity, by mixing and matching coding sequences with novel regulatory elements. Generally, the low proportion of coding region in a genome minimizes the chance of a successful retroposition event. In fact, considering the insertional mutagenesis nature of retroposons, those inserted into intragenic regions (especially inside exons) were prone to be purged during evolution, leaving the intergenic retroposons (especially those in close proximity to genes). As a result of the retropositional process, retroposons usually have structural defects (such as truncations, deletions, additions, and premature stop codons) that preclude their functionality. Nevertheless, retroposition has a decisive advantage over segmental duplication because it juxtaposes an already existing coding sequence with a different regulatory element. For segmental duplications, subsequent changes in the corresponding regulatory elements are required to generate different control regions and alter temporal or spatial expression patterns. In this respect, retroposition has been viewed as sowing seeds of evolution for new gene origination, rather than just representing an evolutionary dead end (Brosius, 1991). In this work, we noticed some expressed retroposons that had changed their targeting signals through retroposition, and such intersignal exchange may have considerable impact. For example, At4g16580 lost the original chloroplast transit peptide because of 5' truncation after retroposition, while both the retroposon and the parental gene have kept their coding potential and transcriptional activity. We even identified one rice chimeric gene that had acquired a chloroplast transit peptide by capturing two additional exons; thus, the cytoplasmically localized protein gained a novel function in the chloroplast.
We calculated the ratio of the nonsynonymous substitution rate (Ka) to Ks (Ka/Ks) of each retroposon with its parental gene as an indicator of the selective constraints on the new retroposed gene. It is generally believed that pseudogenes evolve neutrally with no selective pressure, giving a Ka/Ks value of 1 (Ka = Ks; Torrents et al., 2003
In addition, 11 out of 22 processed pseudogenes have evidence of transcriptions. Recently, a new functional role of such expressed pseudogenes was revealed as ncRNA by regulating mRNA stability of its homologous parental gene in coding region (Hirotsune et al., 2003
Identification Strategies and Bioinformatics Methods
The Arabidopsis (Arabidopsis thaliana) genomic sequence and the total gene set were retrieved from Munich Information Center for Protein Sequences (MIPS; http://ftpmips.gsf.de/cress), and exon/intron structure of each gene was derived from alignment of each CDS with genomic sequence using AAT package (Huang et al., 1997
For each retroposon, exact homologous endpoints of matched region were derived from sequence alignment, and the presence of a poly-(A) tail was checked in a 500-bp region downstream of the 3' endpoint. A poly-(A) tract was defined as
For genomic PCR on different Arabidopsis ecotypes, primers were designed on conserved regions of the eight recent retroposons and their parental genes. Total genomic DNA was prepared from light-grown Arabidopsis seedlings with the DNeasy Plant Mini kit (Qiagen, Valencia, CA). For RT-PCR, gene-specific primers were designed for each retroposon and its parental gene. RNA was extracted from root, shoot, leaf, and flower tissue samples with the RNeasy Plant Mini kit (Qiagen). First strand cDNA was synthesized with SuperScript II RT (Invitrogen, Carlsbad, CA) at 42°C for 1 h, and PCR was carried out with Taq DNA polymerase (TaKaRa Biotechnology, Dalian, China) using the following program: an initial 95°C for 1 min followed by 30 cycles of 95°C for 30 s, 56°C for 30 s, and 72°C for 1 min. Considering the intronless nature of retroposons, total RNA was digested with RNase-free DNase to avoid any possible DNA contamination, and controls containing mRNA that had not been reverse transcribed were run. Information about the PCR primers and Arabidopsis ecotypes used can be found in the supplemental tables.
We thank Dr. Hairong Wang for help in data collection and Yanlei Fu for taking care of Arabidopsis plants. We also thank Professor Daoxiu Zhou (Université Paris-sud XI) and two anonymous reviewers for their constructive suggestions on manuscript revision. We are greatly indebted to the Arabidopsis Biological Resource Center (ABRC) for providing seed stocks of different Arabidopsis ecotypes.
Received January 26, 2005; returned for revision March 14, 2005; accepted March 17, 2005.
1 This work was supported by the Ministry of Sciences and Technology (grant nos. 2002AA2Z1003 and 2003AA222091), by the Chinese Academy of Sciences, by the Shanghai Municipal Commission of Sciences and Technology (grant no. 038019315), and by the National Natural Science Foundation of China (grant no. 30325014).
2 These authors contributed equally to the paper.
[w] The online version of this article contains Web-only data. Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.105.060244.
* Corresponding author; e-mail bhan{at}ncgr.ac.cn; fax 862164825775.
Abbott RJ, Gomes MF (1989) Population genetic structure and outcrossing rate of Arabidopsis thaliana (L.) Heynh. Heredity 62: 411-418
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402
Baldauf SL, Palmer JD (1990) Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature 344: 262-265
Baumbusch LO, Thorstensen T, Krauss V, Fischer A, Naumann K, Assalkhou R, Schulz I, Reuter G, Aalen RB (2001) The Arabidopsis thaliana genome contains at least 29 active genes encoding SET domain proteins that can be assigned to four evolutionarily conserved classes. Nucleic Acids Res 29: 4319-4333
Berkemeyer M, Scheibe R, Ocheretina O (1998) A novel, non-redox-regulated NAD-dependent malate dehydrogenase from chloroplasts of Arabidopsis thaliana L. J Biol Chem 273: 27927-27933
Betrán E, Thornton K, Long M (2002) Retroposed new genes out of the X in Drosophila. Genome Res 12: 1854-1859
Blanc G, Wolfe KH (2004a) Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16: 1667-1678
Blanc G, Wolfe KH (2004b) Functional divergence of duplicated genes formed by polyploidy during Arabidopsis evolution. Plant Cell 16: 1679-1691
Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433-438
Brosius J (1991) Retroposons: seeds of evolution. Science 251: 753
Brosius J (1999) RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238: 115-134
Brosius J (2003) The contribution of RNAs and retroposition to evolutionary novelties. Genetica 118: 99-116
Comeron JM (1999) K-Estimator: calculation of the number of nucleotide substitutions per site and the confidence intervals. Bioinformatics 15: 763-764
Deininger PL, Batzer MA (2002) Mammalian retroelements. Genome Res 12: 1455-1465
Drouin G, Dover GA (1987) A plant processed pseudogene. Nature 328: 557-558
Emanuelsson O, Nielsen H, Brunak S, von Heijne G (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300: 1005-1016
Fink GR (1987) Pseudogenes in yeast? Cell 49: 5-6
Gilbert N, Lutz-Prigge S, Moran JV (2002) Genomic deletions created upon LINE-1 retrotransposition. Cell 110: 315-325
Gonçalves I, Duret L, Mouchiroud D (2000) Nature and structure of human genes that generate retropseudogenes. Genome Res 10: 672-678
Harrison PM, Echols N, Gerstein MB (2001) Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome. Nucleic Acids Res 29: 818-830
Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M (2003) Identification of pseudogenes in the Drosophila melanogaster genome. Nucleic Acids Res 31: 1033-1037
Higgins D, Thompson J, Gibson T, Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673-4680
Hirotsune S, Yoshida N, Chen A, Garrett L, Sugiyama F, Takahashi S, Yagami K, Wynshaw-Boris A, Yoshiki A (2003) An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene. Nature 423: 91-96
Huang X, Adams MD, Zhou H, Kerlavage AR (1997) A tool for analyzing and annotating genomic sequences. Genomics 46: 37-45
Jurka J (2000) Repbase update: a database and an electronic journal of repetitive elements. Trends Genet 16: 418-420
Koch M, Haubold B, Mitchell-Olds T (2000) Comparative evolutionary analysis of the chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis and related genera. Mol Biol Evol 17: 1483-1498
Lamblin AF, Crow JA, Johnson JE, Silverstein KA, Kunau TM, Kilian A, Benz D, Stromvik M, Endre G, VandenBosch KA, et al (2003) MtDB: a database for personalized data mining of the model legume Medicago truncatula transcriptome. Nucleic Acids Res 31: 196-201
Li Y, Darley CP, Ongaro V, Fleming A, Schipper O, Baldauf SL, McQueen-Mason SJ (2002) Plant expansins are a complex multigene family with an ancient evolutionary origin. Plant Physiol 128: 854-864
Long M, Betrán E, Thornton K, Wang W (2003) The origin of new genes: glimpses from the young and old. Nat Rev Genet 4: 865-875
Martignetti JA, Brosius J (1993) BC200 RNA: a neural RNA polymerase III product encoded by a monomeric Alu element. Proc Natl Acad Sci USA 90: 11563-11567
Martin W, Herrmann RG (1998) Gene transfer from organelles to the nucleus: how much, what happens, and why? Plant Physiol 118: 9-17
Martin W, Rujan T, Richly E, Hansen A, Cornelsen S, Lins T, Leister D, Stoebe B, Hasegawa M, Penny D (2002) Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc Natl Acad Sci USA 99: 12246-12251
Minorsky PV (2001) The hot and the classic. Plant Physiol 126: 471-472
Mladek C, Guger K, Hauser M-T (2003) Identification and characterization of the ARIADNE gene family in Arabidopsis. A group of putative E3 ligases. Plant Physiol 131: 27-40
Palmer LE, Rabinowicz PD, O'Shaughnessy AL, Balija VS, Nascimento LU, Dike S, de la Bastide M, Martienssen RA, McCombie WR (2003) Maize genome sequencing by methylation filtration. Science 302: 2115-2117
Rogers J (1983) Retroposons defined. Nature 301: 460
Roos C, Schmitz J, Zischler H (2004) Primate jumping genes elucidate strepsirrhine phylogeny. Proc Natl Acad Sci USA 101: 10650-10654
Roy SW, Fedorov A, Gilbert W (2003) Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci USA 100: 7158-7162
Sharbel TF, Haubold B, Mitchell-Olds T (2000) Genetic isolation by distance in Arabidopsis thaliana: biogeography and postglacial colonization of Europe. Mol Ecol 9: 2109-2118
Strichman-Almashau LZ, Bustin M, Landsman D (2003) Retroposed copies of the HMG genes: a window to genome dynamics. Genome Res 13: 800-812
Symers DE, Connelly C, Szak ST, Caputo EM, Cost GJ, Parmigiani G, Boeke J (2002) Human L1 retrotransposition is associated with genetic instability in vivo. Cell 110: 327-338
The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815
Torrents D, Suyama M, Zdobnov E, Bork P (2003) A genome-wide survey of human pseudogenes. Genome Res 13: 2559-2567
Vander Zwan C, Brodie SA, Campanella JJ (2000) The intraspecific phylogenetics of Arabidopsis thaliana in worldwide populations. Syst Bot 25: 47-59
Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al (2001) The sequence of the human genome. Science 291: 1304-1351
Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, et al (2003) Enrichment of gene-coding sequences in maize by genome filtration. Science 302: 2118-2120
Witte C-P, Le QH, Bureau T, Kumar A (2001) Terminal-repeat retrotransposons in miniature (TRIM) are involved in restructuring plant genomes. Proc Natl Acad Sci USA 98: 13778-13783
Yang Y-W, Lai K-N, Tai P-Y, Li W-H (1999) Rate of nucleotide substitution in angiosperm mitochondrial DNA sequences and dates of divergence between Brassica and other angiosperm lineages. J Mol Evol 48: 597-604
Zhang Z, Harrison PM, Liu Y, Gerstein M (2003) Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res 13: 2541-2558