Plant Physiology 132:1162-1176 (2003) © 2003 American Society of Plant Biologists
Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes¹
Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, B9000 Gent, Belgium (S.R., K.F., Y.V.d.P.); Laboratoire de Génétique et Physiologie du Développement, Equipe bioinformatique, Centre National de la Recherche Scientifique, Parc Scientifique de Luminy, F-13288 Marseille Cedex 9, France (M.L.); Department of Electrical Engineering (Electronics, Systems, Automatisation and Technology-Signals, Identification, System Theory, and Automation), Katholieke Universiteit Leuven, B3001 Heverlee, Belgium (K.M.); and Laboratoire Associé de l'Institut National de la Recherche Agronomique (France), Ghent University, K.L. Ledeganckstraat 35, B9000 Gent, Belgium (P.R.)
The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called "search by signal" methods) and the delineation of promoters by considering both sequence content and structural features ("search by content" methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5'-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. 
Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be developed for the successful detection of "putative" CpG and CpNpG islands in plants.
Arabidopsis, and probably most plants, encode an exceptionally large number of DNA-binding proteins, potentially acting as transcription factors (TFs). In fact, more than 3,000 genes have been anticipated to be involved in transcription, more than one-half of which were expected to encode TFs (Arabidopsis Genome Initiative, 2000
A promoter region, as described above, presents a rather linear view of the
promoter. In reality, a supplementary layer of complexity is added by bringing
the TFs together on a promoter, by adopting a three-dimensional configuration,
enabling the interaction with other parts to activate the basal transcription
machinery (Fig. 1;
Buratowski, 1997
The three-way connection between methylation, gene activity, and chromatin
structure has been known for almost two decades. DNA methylation has been
shown to repress transcription initiation by interfering directly with the
binding of transcriptional activators or indirectly by binding proteins with
affinity for methylated DNA (Weber et
al., 1990
Much attention has been paid to investigate the modular structure of
regulatory regions that control the transcription of eukaryotic genes
(Dynan, 1989
All of these different levels of complexity have great repercussions on the in silico identification of binding sites and promoters. Here, we review current approaches (summarized in Fig. 2) to identify promoters and their regulatory elements.
Promoter Prediction
Unlike gene prediction (Mathé
et al., 2002
In 1997, Fickett and Hatzigeorgiou
(1997
On the one hand, sequence-based algorithms aim at identifying regulatory
regions and promoters based on their sequence composition compared with that
of non-promoters. Among others, Scherf et al.
(2000
PromoterInspector (Scherf et al.,
2000
McPromoter (Ohler et al.,
1999
Although the prediction tools hitherto developed can produce acceptable
results for certain species, none of them have been trained and adapted for
plants. For example, McPromoter is trained specifically to analyze data from
fruit fly (Drosophila melanogaster) and has been used in the Genome
Annotation Assessment project (Reese et
al., 2000
A structural feature that has proven useful in the detection of promoters
in the human genome is the so-called CpG island, i.e. a region that is rich
in CpGs; such islands are important because of their strong link with gene
regulation. In general, CpG-rich regions are methylated and are associated
with inactive DNA often linked to heterochromatin, gene silencing, and
pathogen control (Jeddeloh et al.,
1998
Although the functional significance of methylation appears to be similar
in humans and plants (Hershkovitz et al.,
1990
CpG islands are characterized by a locally increased GC percentage (GC%)
compared with local averages and by the presence of CpGs (and CpNpGs in
plants). The CpG dinucleotide, usually methylated at the fifth position on the
cytosine ring, is counter-selected and found much less frequently than
expected based on mononucleotide frequencies, for example, 5-fold lower in
genomes of vertebrates. This depletion is believed to result from accidental
mutations by deamination of 5-methylcytosine to thymine
(Sved and Bird, 1990
The original pragmatic definition of a CpG island in human sequences
considers a GC% higher than 50 and a ratio between observed and expected (o/e)
occurrence of CG dinucleotides of 0.6 over a window of 200 bp
(Gardiner-Garden and Frommer,
1987
A program in Perl was written that computes the GC content and the o/e ratios of CpG and CpNpG compared with local characteristics over a certain window size. By applying this program to the ARAPROM data set to extract potential CpG/CpNpG islands, we tested the effect of setting the cut-off values for the GC content and the o/e CpG/CpNpG ratios at different levels (39% to 52% with a stepwise increase of 0.5% for the GC content; 0.6 to 2.0 for the o/e CpG and CpNpG ratios with a stepwise increase of 0.1). The results of this analysis with a window size of 200 bp are shown graphically for CpG and CpNpG islands (Figs. 3 and 4, respectively). The first observation is that no CpG island is detected with the cut-off parameters tuned for humans, except for a few in coding exons. Both parameters, GC% and o/e CpG, appear to strongly influence the number of CpG islands detected, and their effect differs depending on the position in the genome. The number found in the "promoter" region increases sharply as the GC% cut-off decreases (Fig. 3). In contrast, for coding exons, the landscape more closely resembles a plateau, with many CpG islands found already at much higher GC% values. Only at the lowest GC% values are more CpG islands predicted in the "promoter" region than in the coding exons. In UTR exons, which show a landscape similar to that of coding exons, fewer CpG islands are found, and introns, which show a landscape more similar to that of the "promoter" region, show the lowest number of CpG islands.
Regarding the CpNpG landscape (Fig. 4), the major observation is that the overall number of islands lies well below that of the CpG islands. In addition, the same differences in landscape hold, as observed for CpG islands between promoter (and introns), on the one hand, and coding exons (and UTR exons), on the other hand. Nevertheless, a striking difference is that for CpNpGs, the o/e CpNpG threshold has to be very low for those islands to be detected. In terms of number of genes associated with CpG/CpNpG islands, different parameter settings lead to very different figures (Table II).
This preliminary in silico analysis shows that prediction of promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, there is some hope that, based on such a classification, more sophisticated approaches can be developed to detect CpG and CpNpG islands in plants.
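The windowed island criterion discussed in this section (a GC% cut-off combined with an o/e cut-off over a 200-bp window) can be illustrated with a minimal Python sketch. This is not the Perl program used for the analysis. The function names `window_stats` and `candidate_windows` are hypothetical, the expected count is assumed to be (number of C × number of G) / window length for both CpG and CpNpG, and a window is assumed to qualify when both values are at or above the cut-offs.

```python
def window_stats(window):
    """Return (GC%, o/e CpG, o/e CpNpG) for one sequence window."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    gc_pct = 100.0 * (c + g) / n if n else 0.0
    # Assumption: expected number of C..G pairs if C and G occur independently.
    expected = c * g / n if n else 0.0
    # Observed CpG: a C immediately followed by a G.
    cpg = sum(1 for i in range(n - 1) if w[i] == "C" and w[i + 1] == "G")
    # Observed CpNpG: a C followed by any nucleotide and then a G.
    cpnpg = sum(1 for i in range(n - 2) if w[i] == "C" and w[i + 2] == "G")
    oe_cpg = cpg / expected if expected else 0.0
    oe_cpnpg = cpnpg / expected if expected else 0.0
    return gc_pct, oe_cpg, oe_cpnpg


def candidate_windows(seq, window=200, step=1, min_gc=50.0, min_oe=0.6):
    """Start positions of windows whose GC% and o/e CpG reach the cut-offs."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_pct, oe_cpg, _ = window_stats(seq[start:start + window])
        if gc_pct >= min_gc and oe_cpg >= min_oe:
            hits.append(start)
    return hits
```

Lowering `min_gc` toward 39 and varying `min_oe` between 0.6 and 2.0 would mimic the parameter sweep described above; merging overlapping windows into islands is deliberately left out of this sketch.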
Regulatory Elements
As stated in the introduction, CAREs are short, conserved motifs of
approximately 5 to 20 nucleotides. Detection of CAREs in the promoter is not
self-evident, because such short motifs are statistically expected to occur at
random every few hundred base pairs. Therefore, the main problem lies in
discriminating "true" from "false" regulatory elements
(Blanchette and Sinha, 2001
Co-expressed genes can be identified through transcript profiling
techniques, such as microarrays (Brown and
Botstein, 1999
Because co-expressed genes tend to behave similarly, they are expected to be coregulated. Under the simplifying assumption that this coregulation occurs at the transcriptional level, co-expressed genes should contain similar cis-regulatory elements in their promoter regions. As a consequence, these yet unknown cis-regulatory elements will be statistically overrepresented in the intergenic regions of the co-expressed genes in comparison with their frequency of occurrence in a set of unrelated sequences. This overrepresentation constitutes the general principle on which motif detection algorithms are based.
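The overrepresentation principle can be made concrete with a small Python sketch that counts every word of length k in a set of co-expressed promoters and compares its frequency with that in a set of unrelated sequences. The names `count_words` and `overrepresented`, the pseudocount, and the simple frequency-ratio cut-off are illustrative assumptions; published tools typically use more careful significance statistics.

```python
from collections import Counter


def count_words(seqs, k):
    """Count all overlapping words of length k across the sequences."""
    counts = Counter()
    total = 0
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
            total += 1
    return counts, total


def overrepresented(coexpressed, background, k=6, min_ratio=2.0, pseudo=1.0):
    """Words whose frequency in the co-expressed set exceeds the background.

    Returns (word, count, ratio) tuples sorted by decreasing ratio.
    The pseudocount (an assumption) avoids division by zero for words
    that never occur in the background set.
    """
    fg, fg_total = count_words(coexpressed, k)
    bg, bg_total = count_words(background, k)
    results = []
    for word, n in fg.items():
        fg_freq = n / fg_total
        bg_freq = (bg[word] + pseudo) / (bg_total + pseudo * 4 ** k)
        ratio = fg_freq / bg_freq
        if ratio >= min_ratio:
            results.append((word, n, ratio))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

With two toy promoters that share the word CGTG and a background lacking it, `overrepresented(["AAACGTGAAA", "TTTCGTGTTT"], ["AAAAAAAAAA", "TTTTTTTTTT"], k=4)` ranks CGTG first, because it is the only word that occurs in both promoters.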
Usually, genes are part of more extensive gene families that have
originated through both speciation and duplication events. Homologous genes in
distinct species are called orthologs, whereas paralogs refer to homologous
genes that are found in the same genome and have been created through gene
duplication (Mindell and Meyer,
2001
To conceive a general method that can detect regulatory motifs is a great
challenge because of both the complexity and flexibility of the regulatory
mechanisms (see the introduction). An important distinction between the
different approaches used thus far to detect regulatory motifs lies in the
representation of the motif, i.e. the TF-binding site. The simplest
description for a motif is a string of characters (A, C, G, and T), extended
with the 11 IUPAC characters that represent partly unspecified or ambiguous
nucleotides, and is used in the string-based approaches, such as word
counting. A more sophisticated description is to represent a given motif by
describing it in a probabilistic manner in which a certain likelihood is
assessed for each nucleotide at a given position in the motif. An example of a
probabilistic representation is the position-weight matrix, where each column
corresponds to a position in the aligned binding sites and each row to a
nucleotide, as shown in Figure
5. The cells of these matrices contain a number indicating the
probability of finding a given nucleotide at that particular position.
Alternatives to describe motifs in a probabilistic manner are the hidden
Markov models (Jarmer et al.,
2001
Counting all of the possible words that may occur across the different
promoter sequences is one of the simplest approaches to find CAREs in a set of
promoters. Among word-counting methods, enumerative and suffix-tree approaches
can be distinguished, the latter being an optimization of the former. Both
methods are string based: The DNA sequence is considered as text in which
oligonucleotides are represented as words or strings. For a given set of
promoter sequences, the frequency of each possible word of a defined length is
computed (Hutchinson, 1996
Once the frequencies of different words are calculated, the words that are
likely to be a "true" regulatory motif have to be differentiated
from those that are not. Therefore, in each of these word-counting methods,
the number of occurrences of a word needs to be compared with the expected
frequency in a set of non-related sequences, represented by a background
model, which is used to obtain an expected probability. The simplest way to
build a background model is by creating a set of randomly generated sequences,
based on the single nucleotide composition of the submitted sequence. More
sophisticated ways to generate a background model are based on Markov chain
statistics (Schbath et al.,
1995
Probabilistic motif detection aims at constructing a multiple alignment by locally aligning small conserved regions in a set of unaligned sequences. Here, we will focus on the matrix-based approaches to illustrate probabilistic motif detection procedures. All methods start from a random motif model, represented as a weight matrix and altered through a series of iterations by machine-learning algorithms that are aimed at finding the optimal score. The process of optimizing the score for a local alignment already tends to converge toward conserved motifs that occur frequently in the data set. The more advanced algorithms incorporate a background model to compensate for given motifs occurring at high frequencies because of compositions similar to those of the non-conserved parts of the sequence (the "background"). A motif in which the average nucleotide composition differs strongly from the background will be assigned a higher score. Implementations differ from each other in the way the background is represented, in how the score is calculated, and in how the optimization is performed. For motif detection algorithms that describe the motif by a weight matrix, expectation maximization and its stochastic variant, Gibbs sampling, are often used as optimization strategies.
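To make the weight-matrix representation and background-corrected score concrete, the following Python sketch builds a matrix from already aligned sites and scores candidate words by their log-odds against a background composition. It covers only the scoring step, not the iterative optimization (expectation maximization or Gibbs sampling). The names `build_pwm`, `log_odds`, and `best_site`, the pseudocount, and the zero-order background are assumptions made for illustration; the matrix is stored as one dictionary per motif position.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudo=1.0):
    """Per-position nucleotide probabilities estimated from aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [s[pos].upper() for s in sites]
        total = len(column) + pseudo * len(BASES)
        # The pseudocount (an assumption) keeps unseen nucleotides above zero.
        pwm.append({b: (column.count(b) + pseudo) / total for b in BASES})
    return pwm


def log_odds(word, pwm, background):
    """Log2 ratio of motif probability to background probability for a word."""
    return sum(
        math.log2(pwm[i][b] / background[b]) for i, b in enumerate(word.upper())
    )


def best_site(seq, pwm, background):
    """Return (score, start) of the highest-scoring window in seq."""
    k = len(pwm)
    scored = [
        (log_odds(seq[i:i + k], pwm, background), i)
        for i in range(len(seq) - k + 1)
    ]
    return max(scored)
```

A word scores above zero only when the motif model explains it better than the background does, which is why a motif whose composition resembles the background receives a low score.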
The program CONSENSUS was one of the first algorithms that represented a
motif by a weight matrix (Hertz et al.,
1990
The expectation-maximization (EM) method (Stormo,
1988
Gibbs sampling-based strategies were originally developed to detect
protein motifs but were later adapted to handle DNA sequences
(Neuwald et al., 1995
Adaptive quality-based clustering (De
Smet et al., 2002
The procedure that identifies regulatory elements based on a set of
orthologous sequences is named phylogenetic footprinting
(Koop, 1995
A promising novel algorithm has recently been published that identifies the
most conserved motifs among the input sequences as measured by a parsimony
score on the underlying phylogenetic tree
(Blanchette et al., 2002
The most obvious reason why motif detection algorithms fail is because of
their sensitivity to noise. All parts of a sequence that do not contain the
motif constitute noise in the context of motif detection. Moreover, because
sets of related sequences are usually based on other predictive tools, for
instance clustering, they are expected to contain sequences without any shared
motif. A decreasing signal-to-noise ratio hampers the identification of
statistically overrepresented motifs and increases the chance of finding false
positives. Probabilistic motif detection methods have been improved
considerably to cope with a large noise level. Current implementations, such
as AlignACE (Hughes et al.,
2000
Because regulatory motifs, in particular in higher eukaryotes, are
concentrated in modules, current research is focusing on adapting motif
detection algorithms to retrieve dyads, i.e. motifs spaced by a fixed or
variable gap. Within the enumerative statistical methods, Sinha and Tompa
(2000
Vanet et al. (2000
The need for extensive parameter fine tuning complicates nonexpert use of
most of the motif detection approaches described above. Novel implementations
of motif detection algorithms tackle this problem by estimating the optimal
parameter settings themselves, hence, minimizing the number of user-defined
parameters. An example of such a user-defined parameter is the motif length.
Because the motif length is generally unknown in advance, it is not obvious how to
choose the parameter setting that results in the true motif. Some algorithms
compute the optimal motif length; for instance, Pattern assembly
(van Helden et al., 2000b
Promoters are very complex structures, defined by many different structural features. The actual regulatory elements are usually very short, which greatly complicates their unambiguous identification. As a consequence, the in silico prediction of promoters and regulatory motifs is not straightforward. In addition, our knowledge of transcription regulation in general, and of organism-specific expression regulation in particular, is still very limited. Especially for plants, solid "intrinsic" genomic data are still needed that can be integrated into existing prediction tools. In this respect, we have started with the analysis of CpG and CpNpG islands, known to be often associated with promoters. Although several implementations for the detection of such "islands" in vertebrates have been described (Ioshikhes and Zhang, 2000
We thank two anonymous reviewers for helpful suggestions. Received November 14, 2002; returned for revision January 10, 2003; accepted March 17, 2003.
Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.102.017715.
1 This work was supported by the Vlaams Instituut voor de Bevordering van het
Wetenschappelijk-Technologisch Onderzoek (grant no. STWW980396). K.F.
is indebted to the Instituut voor de aanmoediging van Innovatie door
Wetenschap en Technologie in Vlaanderen for a predoctoral fellowship, K.M. is
Research Fellow of the Fund for Scientific Research (Flanders), and P.R. is a
Research Director of the Institut National de la Recherche Agronomique
(France).
2 These authors contributed equally to the paper.
* Corresponding author; e-mail pierre.rouze{at}gengenp.rug.ac.be; fax 3292645349.
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815
Altschmied J, Delfgaauw J, Wilde B, Duschl J, Bouneau L, Volff JN, Schartl M (2002) Subfunctionalization of duplicate mitf genes associated with differential degeneration of alternative exons in fish. Genetics 161: 259-267
Antequera F, Bird A (1999) CpG islands as genomic footprints of promoters that are associated with replication origins. Curr Biol 9: R661-R667
Aparicio S, Chapman J, Stupka E, Putnam N, Chia JM, Dehal P, Christoffels A, Rash S, Hoon S, Smit A et al. (2002) Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 23: 1301-1310
Ashikawa I (2001) Gene-associated CpG islands in plants as revealed by analyses of genomic sequences. Plant J 26: 617-625
Bagga R, Michalowski S, Sabnis R, Griffith JD, Emerson BM (2000) HMG I/Y regulates long range enhancer-dependent transcription on DNA and chromatin by changes in DNA topology. Nucleic Acids Res 28: 2541-2550
Bajic V, Seah S, Chong A, Zhang G, Koh J, Brusic V (2002) Dragon Promoter Finder: recognition of vertebrate RNA polymerase II promoters. Bioinformatics 18: 198-199
Bailey TL, Elkan C (1995) The value of prior knowledge in discovering motifs with MEME. Proc Int Conf Intell Syst Mol Biol 3: 21-29
Baldi P, Chauvin Y, Brunak S, Gorodkin J, Pedersen AG (1998) Computational applications of DNA structural scales. Proc Int Conf Intell Syst Mol Biol 6: 35-42
Barton MC, Madani N, Emerson BM (1997) Distal enhancer regulation by promoter derepression in topologically constrained DNA in vitro. Proc Natl Acad Sci USA 94: 7257-7262
Beato M, Eisfeld K (1997) Transcription factor access to chromatin. Nucleic Acids Res 25: 3559-3563
Bender J (2001) A vicious cycle: RNA silencing and DNA methylation in plants. Cell 106: 129-132
Bentin T, Nielsen PE (2002) In vitro transcription of a torsionally constrained template. Nucleic Acids Res 30: 803-809
Berk AJ (1999) Activation of RNA polymerase II transcription. Curr Opin Cell Biol 11: 330-335
Blanchette M, Sinha S (2001) Separating real motifs from their artifacts. Bioinformatics 17: 30-38
Blanchette M, Schwikowski B, Tompa M (2002) Algorithms for phylogenetic footprinting. J Comput Biol 9: 211-223
Blanchette M, Tompa M (2002) Discovery of regulatory elements by a computational method for phylogenetic footprinting. Genome Res 12: 739-748
Bolshoy A, McNamara P, Harrington RE, Trifonov EN (1991) Curved DNA without A-A: experimental estimation of all 16 DNA wedge angles. Proc Natl Acad Sci USA 88: 2312-2316
Br
Breslauer KJ, Frank R, Blocker H, Marky LA (1986) Predicting DNA duplex stability from the base sequence. Proc Natl Acad Sci USA 83: 3746-3750
Breyne P, Dreesen R, Vandepoele K, De Veylder L, Van Breusegem F, Callewaert L, Rombauts S, Raes J, Cannoot B, Engler G et al. (2002) Transcriptome analysis during cell division in plants. Proc Natl Acad Sci USA 99: 14825-14830
Brower-Toland BD, Smith CL, Yeh RC, Lis JT, Peterson CL, Wang MD (2002) Mechanical disruption of individual nucleosomes reveals a reversible multistage release of DNA. Proc Natl Acad Sci USA 99: 1960-1965
Brown PO, Botstein D (1999) Exploring the new world of the genome with DNA microarrays. Nat Genet 21: 33-37
Brukner I, Sanchez R, Suck D, Pongor S (1995a) Sequence-dependent bending propensity of DNA as revealed by DNase I: parameters for trinucleotides. EMBO J 14: 1812-1818
Brukner I, Sanchez R, Suck D, Pongor S (1995b) Trinucleotide models for DNA bending propensity: comparison of models based on DNaseI digestion and nucleosome packaging data. J Biomol Struct Dyn 13: 309-317
Buratowski S (1997) Snapshots of RNA polymerase II transcription initiation. Curr Opin Cell Biol 12: 320-325
Bussemaker HJ, Li H, Siggia ED (2000a) Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis. Proc Natl Acad Sci USA 97: 10096-10100
Bussemaker HJ, Li H, Siggia ED (2000b) Regulatory element detection using a probabilistic segmentation model. Proc Int Conf Intell Syst Mol Biol 8: 67-74
Cao X, Jacobsen SE (2002) Locus-specific control of asymmetric and CpNpG methylation by the DRM and CMT3 methyltransferase genes. Proc Natl Acad Sci USA 99: 16491-16498
Cao X, Springer NM, Muszynski MG, Phillips RL, Kaeppler S, Jacobsen SE (2000) Conserved plant genes with similarity to mammalian de novo DNA methyltransferases. Proc Natl Acad Sci USA 97: 4979-4984
Cardon LR, Stormo GD (1992) Expectation maximization algorithm for identifying protein-binding sites with variable lengths from unaligned DNA fragments. J Mol Biol 223: 159-170
Colinas J, Birnbaum K, Benfey PN (2002) Using cauliflower to find conserved non-coding regions in Arabidopsis. Plant Physiol 129: 451-454
Coward E (1999) Shufflet: shuffling sequences while conserving the k-let counts. Bioinformatics 15: 1058-1059
Crothers DM (1998) DNA curvature and deformation in protein-DNA complexes: a step in the right direction. Proc Natl Acad Sci USA 95: 15163-15165
Davuluri RV, Grosse I, Zhang MQ (2001) Computational identification of promoters and first exons in the human genome. Nat Genet 29: 412-417
de Boer GJ, Testerink C, Pielage G, Nijkamp HJ, Stuitje AR (1999) Sequences surrounding the transcription initiation site of the Arabidopsis enoyl-acyl carrier protein reductase gene control seed expression in transgenic tobacco. Plant Mol Biol 39: 1197-1207
Dermitzakis ET, Clark AG (2002) Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover. Mol Biol Evol 19: 1114-1121
De Smet F, Mathys J, Marchal K, Thijs G, De Moor B, Moreau Y (2002) Adaptive quality-based clustering of gene expression profiles. Bioinformatics 18: 735-746
Dorsett D (1999) Distant liaisons: long-range enhancer-promoter interactions in Drosophila. Curr Opin Genet Dev 9: 505-514
Down TA, Hubbard TJ (2002) Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res 12: 458-461
Duret L, Bucher P (1997) Searching for regulatory elements in human noncoding sequences. Curr Opin Struct Biol 7: 399-406
Duret L, Galtier N (2000) The covariation between TpA deficiency, CpG deficiency, and G+C content of human isochores is due to a mathematical artifact. Mol Biol Evol 17: 1620-1625
Dynan WS (1989) Modularity in promoters and enhancers. Cell 58: 1-4
El Hassan MA, Calladine CR (1996) Propeller-twisting of base-pairs and the conformational mobility of dinucleotide steps in DNA. J Mol Biol 259: 95-103
Engelen K, Coessens B, Marchal K, De Moor B (2003) MARAN: normalizing micro-array data. Bioinformatics 19: 893-894
Featherstone M (2002) Coactivators in transcription initiation: here are your orders. Curr Opin Genet Dev 12: 149-155
Fessele S, Maier H, Zischek C, Nelson PJ, Werner T (2002) Regulatory context is a crucial part of gene function. Trends Genet 18: 60-63
Fickett JW, Hatzigeorgiou AG (1997) Eukaryotic promoter recognition. Genome Res 7: 861-878
Fickett JW, Wasserman WW (2000) Discovery and modeling of transcriptional regulatory regions. Curr Opin Biotechnol 11: 19-24
Finnegan EJ, Genger RK, Kovac K, Peacock WJ, Dennis ES (1998a) DNA methylation and the promotion of flowering by vernalization. Proc Natl Acad Sci USA 95: 5824-5829
Finnegan EJ, Genger RK, Peacock WJ, Dennis ES (1998b) DNA methylation in plants. Annu Rev Plant Physiol Plant Mol Biol 49: 223-247
Finnegan EJ, Kovac KA (2000) Plant DNA methyltransferases. Plant Mol Biol 43: 189-201
Finnegan EJ, Peacock WJ, Dennis ES (2000) DNA methylation, a key regulator of plant development and other processes. Curr Opin Genet Dev 10: 217-223
Force A, Lynch M, Pickett FB, Amores A, Yan Y-l, Postlethwait J (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151: 1531-1545
Gardiner-Garden M, Frommer M (1987) CpG islands in vertebrate genomes. J Mol Biol 196: 261-282
Ghosh D (2000) Object-oriented transcription factors database (ooTFD). Nucleic Acids Res 28: 308-310
Gidekel M, Jimenez B, Herrera-Estrella L (1996)
The first intron of the Arabidopsis thaliana gene coding for
elongation factor 1
Goodsell DS, Dickerson RE (1994) Bending and curvature calculations in B-DNA. Nucleic Acids Res 22: 5497-5503
Gorin AA, Zhurkin VB, Olson WK (1995) B-DNA twisting correlates with base-pair morphology. J Mol Biol 247: 34-48
Grabe N (2002) AliBaba2: context specific identification of transcription factor binding sites. In Silico Biol 2: S11
GuhaThakurta D, Stormo GD (2001) Identifying target sites for cooperatively binding factors. Bioinformatics 17: 608-621
Hampson S, Kibler D, Baldi P (2002) Distribution patterns of overrepresented k-mers in noncoding yeast DNA. Bioinformatics 18: 513-528
Hannenhalli S, Levy S (2001) Promoter prediction in the human genome. Bioinformatics 17: S90-S96
Hardison RC (2000) Conserved noncoding sequences are reliable guides to regulatory elements. Trends Genet 16: 369-372
Hershkovitz M, Gruenbaum Y, Renbaum P, Razin A, Loyter A (1990) Effect of CpG methylation on gene expression in transfected plant protoplasts. Gene 94: 189-193
Hertz GZ, Hartzell GW III, Stormo GD (1990) Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Comput Appl Biosci 6: 81-92
Hertz GZ, Stormo GD (1996) Escherichia coli promoter sequences: analysis and prediction. Methods Enzymol 273: 30-42
Hertz GZ, Stormo GD (1999) Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15: 563-577
Heyer LJ, Kruglyak S, Yooseph S (1999) Exploring expression data: identification and analysis of coexpressed genes. Genome Res 9: 1106-1115
Higo K, Ugawa Y, Iwamoto M, Korenaga T (1999) Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res 27: 297-300
Ho PS, Ellison MJ, Quigley GJ, Rich A (1986) A computer aided thermodynamic approach for predicting the formation of Z-DNA in naturally occurring sequences. EMBO J 5: 2737-2744
Hughes AL (1994) The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B 256: 119-124
Hughes JD, Estep PW, Tavazoie S, Church GM (2000) Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol 296: 1205-1214
Hutchinson GB (1996) The prediction of vertebrate promoter regions using differential hexamer frequency analysis. Comput Appl Biosci 12: 391-398
Inamdar NM, Ehrlich KC, Ehrlich M (1991) CpG methylation inhibits binding of several sequence-specific DNA-binding proteins from pea, wheat, soybean and cauliflower. Plant Mol Biol 17: 111-123
International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921
Ioshikhes IP, Trifonov EN, Zhang MQ (1999) Periodical distribution of transcription factor sites in promoter regions and connection with chromatin structure. Proc Natl Acad Sci USA 96: 2891-2895
Ioshikhes IP, Zhang MQ (2000) Large-scale human promoter mapping using CpG islands. Nat Genet 26: 61-63
Jarmer H, Larsen TS, Krogh A, Saxild HH, Brunak S, Knudsen S (2001) Sigma A recognition sites in the Bacillus subtilis genome. Microbiology 147: 2417-2424
Jeddeloh JA, Bender J, Richards EJ (1998) The DNA methylation locus DDM1 is required for maintenance of gene silencing in Arabidopsis. Genes Dev 12: 1714-1725
Jegga AG, Sherwood SP, Carman JW, Pinski AT, Phillips JL, Pestian JP, Aronow BJ (2002) Detection and visualization of compositionally similar cis-regulatory element clusters in orthologous and coordinately controlled genes. Genome Res 12: 1408-1417
Jensen LJ, Knudsen S (2000) Automatic discovery of regulatory patterns in promoter regions based on whole cell expression data and functional annotation. Bioinformatics 16: 326-333
Johnson PF, McKnight SL (1989) Eukaryotic transcriptional regulatory proteins. Annu Rev Biochem 58: 799-839
Jones PA (1999) The DNA methylation paradox. Trends Genet 15: 34-37
Juo ZS, Chiu TK, Leiberman PM, Baikalov I, Berk AJ, Dickerson RE (1996) How proteins recognize the TATA box. J Mol Biol 261: 239-254
Kass SU, Landsberger N, Wolffe AP (1997) DNA methylation directs a time-dependent repression of transcription initiation. Curr Biol 7: 157-165
Kleffe J, Borodovsky M (1992) First and second moment of counts of words in random texts generated by Markov chains. Comput Appl Biosci 8: 433-441
Klingenhoff A, Frech K, Werner T (2002) Regulatory modules shared within gene classes as well as across gene classes can be detected by the same in silico. In Silico Biol 2: S17-26
Koch MA, Haubold B, Mitchell-Olds T (2002) Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol Biol Evol 17: 1483-1498
Koch MA, Weisshaar B, Kroymann J, Haubold B, Mitchell-Olds T (2001) Comparative genomics and regulatory evolution: conservation and function of the Chs and Apetala3 promoters. Mol Biol Evol 18: 1882-1891
Kolchanov NA, Podkolodnaya OA, Ananko EA, Ignatieva EV, Stepanenko IL, Kel-Margoulis OV, Kel AE, Merkulova TI, Goryachkovskaya TN, Busygina TV (2000) Transcription regulatory regions database (TRRD): its status in 2000. Nucleic Acids Res 28: 298-301
Kondrakhin YV, Kel AE, Kolchanov NA, Romashchenko AG, Milanesi L (1995) Eukaryotic promoter recognition by binding sites for transcription factors. Comput Appl Biosci 11: 477-488
Koop BF (1995) Human and rodent DNA sequence comparisons: a mosaic model of genomic evolution. Trends Genet 11: 367-371
Kooter JM, Matzke MA, Meyer P (1999) Listening to the silent genes: transgene silencing, gene regulation and pathogen control. Trends Plant Sci 4: 340-347
Kornberg RD, Lorch Y (2002) Chromatin and transcription: Where do we go from here? Curr Opin Genet Dev 12: 249-251
Krivan W, Wasserman WW (2001) A predictive model for regulatory sequences directing liver-specific transcription. Genome Res 11: 1559-1566
Langst G, Becker PB (2001) Nucleosome mobilization and positioning by ISWI-containing chromatin-remodeling factors. J Cell Sci 114: 2561-2568
Larkin JC, Oppenheimer DG, Pollock S, Marks MD (1993) Arabidopsis GLABROUS1 gene requires downstream sequences for function. Plant Cell 5: 1739-1748
Lawrence CE, Altschul SF, Boguski MS, Liu JS, Neuwald AF, Wootton JC (1993) Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262: 208-214
Lawrence CE, Reilly AA (1990) An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins 7: 41-51
Lescot M, Déhais P, Thijs G, Marchal K, Moreau Y, Van de
Peer Y, Rouzé P, Rombauts S (2002) PlantCARE, a
database of plant cis-acting regulatory elements and a portal to
tools for in silico analysis of promoter sequences. Nucleic
Acids Res 30:
325327 Li G, Chandrasekharan MB, Wolffe AP, Hall TC (2001) Chromatin structure and phaseolin gene regulation. Plant Mol Biol 46: 121129[CrossRef][Web of Science][Medline]
Lindroth AM, Cao X, Jackson JP, Zilberman D, McCallum CM,
Henikoff S, Jacobsen SE (2001) Requirement of
CHROMOMETHYLASE3 for maintenance of CpXpG methylation.
Science 292:
20772080 Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ (1999) High density synthetic oligonucleotide arrays. Nat Genet 21: 2024[CrossRef][Web of Science][Medline] Liu XS, Brutlag DL, Liu JS (2002) An algorithm for finding protein DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat Biotechnol 20: 835839[Web of Science][Medline] Liu XS, Brutlag DL, Liu JS (2001) BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac Symp Biocomput 127138
Marilley M, Pasero P (1996) Common DNA
structural features exhibited by eukaryotic ribosomal gene promoters.
Nucleic Acids Res 24:
22042211 Marsan L, Sagot MF (2000) Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification. J Comput Biol 7: 345362[CrossRef][Web of Science][Medline]
Mathé C, Sagot MF, Schiex T, Rouzé P
(2002) Current methods of gene prediction, their strengths and
weaknesses. Nucleic Acids Res 30:
41034117 Meyer P (2000) Transcriptional transgene silencing and chromatin components. Plant Mol Biol 43: 221234[CrossRef][Web of Science][Medline] Meyer P, Niedenhof I, ten Lohuis M (1994) Evidence for cytosine methylation of non-symmetrical sequences in transgenic Petunia hybrida. EMBO J 13: 20842088[Web of Science][Medline] Meza TJ, Enerly E, Boru B, Larsen F, Mandal A, Aalen RB, Jakobsen KS (2002) A human CpG island randomly inserted into a plant genome is protected from methylation. Transgenic Res 11: 133142[CrossRef][Web of Science][Medline] Mindell DP, Meyer A (2001) Homology evolving. Trends Ecol Evol 16: 434440[CrossRef] Moreau Y, De Smet F, Thijs G, Marchal K, De Moor B (2002) Functional bioinformatics of microarray data: from expression to regulation. IEEE Proc 30: 17221743 Neuwald AF, Liu JS, Lawrence CE (1995) Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Sci 4: 16181632[Web of Science][Medline] Ng HH, Bird A (1999) DNA methylation and chromatin modification. Curr Opin Genet Dev 9: 158163[CrossRef][Web of Science][Medline]
Nikolov DB, Burley SK (1997) RNA polymerase II
transcription initiation: a structural view. Proc Natl Acad Sci
USA 94:
1522
Nikolov DB, Chen H, Halay ED, Hoffman A, Roeder RG, Burley
SK (1996) Crystal structure of a human TATA box-binding
protein/TATA element complex. Proc Natl Acad Sci USA
93:
48624867
Ohler U (2000) Promoter prediction on a genomic
scale: the Adh experience. Genome Res
10:
539542
Ohler U, Harbeck S, Niemann H, Noth E, Reese MG
(1999) Interpolated Markov chains for eukaryotic promoter
recognition. Bioinformatics 15:
362369 Ohler U, Niemann H (2001) Identification and analysis of eukaryotic promoters: recent computational approaches. Trends Genet 17: 5660[CrossRef][Web of Science][Medline] Ohler U, Niemann H, Liao GC, Rubin GM (2001) Joint modeling of DNA sequence and physical properties to improve eukaryotic promoter recognition. Bioinformatics 17: S199S206[Abstract] Ohler U, Liao GC, Niemann H, Rubin GM (2000) Computational analysis of core promoters in the Drosophila genome. Genome Biol 3: 0087.10087.12 Oki M, Kamakaka RT (2002) Blockers and barriers to transcription: competing activities. Curr Opin Cell Biol 14: 299304[CrossRef][Web of Science][Medline]
Olson WK, Gorin AA, Lu XJ, Hock LM, Zhurkin VB
(1998) DNA sequence-dependent deformability deduced from
protein-DNA crystal complexes. Proc Natl Acad Sci USA
95:
1116311168
Panstruga R, Buschges R, Piffanelli P, Schulze-Lefert P
(1998) A contiguous 60 kb genomic stretch from barley reveals
molecular evidence for gene islands in a monocot genome. Nucleic Acids
Res 26:
10561062 Pedersen AG, Baldi P, Chauvin Y, Brunak S (1998) DNA structure in human RNA polymerase II promoters. J Mol Biol 281: 663673[CrossRef][Web of Science][Medline] Pedersen AG, Baldi P, Chauvin Y, Brunak S (1999) The biology of eukaryotic promoter prediction: a review. Comput Chem 23: 191207[CrossRef][Web of Science][Medline]
Pesole G, Liuni S, D'Souza M (2000) PatSearch:
a pattern matcher software that finds functional elements in nucleotide and
protein sequences and assesses their statistical significance.
Bioinformatics 16:
439450 Pitto L, Cernilogar F, Evangelista M, Lombardi L, Miarelli C, Rocchi P (2000) Characterization of carrot nuclear proteins that exhibit specific binding affinity towards conventional and nonconventional DNA methylation. Plant Mol Biol 44: 659673[CrossRef][Web of Science][Medline]
Ponger L, Mouchiroud D (2002) CpGProD:
identifying CpG islands associated with transcription start sites in large
genomic mammalian sequences. Bioinformatics
18:
631633 Pradhan S, Urwin NA, Jenkins GI, Adams RL (1999) Effect of CWG methylation on expression of plant genes. Biochem J 341: 473476
Praz V, Perier R, Bonnard C, Bucher P (2002)
The Eukaryotic Promoter Database, EPD: new entry types and links to gene
expression data. Nucleic Acids Res
30:
322324 Prestridge DS (1991) SIGNAL SCAN: A computer program that scans DNA sequences for eukaryotic transcriptional elements. CABIOS 7: 203206 Prestridge DS (1995) Predicting Pol II promoter sequences using transcription factor binding sites. J Mol Biol 249: 923932[CrossRef][Web of Science][Medline] Prince VE, Pickett FB (2002) Splitting pairs: the diverging fates of duplicated genes. Nat Rev Genet 3: 827837[CrossRef][Web of Science][Medline]
Quiros CF, Grellet F, Sadowski J, Suzuki T, Li G, Wroblewski
T (2001) Arabidopsis and Brassica comparative genomics:
sequence, structure and gene content in the
ABI1-Rps2-Ck1 chromosomal segment and related
regions. Genetics 157:
13211330 Razin A (1998) CpG methylation, chromatin structure and gene silencing: a three-way connection. EMBO J 17: 49054908[CrossRef][Web of Science][Medline]
Reese MG, Hartzell G, Harris NL, Ohler U, Abril JF, Lewis SE
(2000) Genome annotation assessment in Drosophila
melanogaster. Genome Res 10:
483501
Reymond P, Weber H, Damond M, Farmer EE (2000)
Differential gene expression in response to mechanical wounding and insect
feeding in Arabidopsis. Plant Cell
12:
707720 Richards EJ, Elgin SC (2002) Epigenetic codes for heterochromatin formation and silencing: rounding up the usual suspects. Cell 108: 489500[CrossRef][Web of Science][Medline]
Riechmann JL, Heard J, Martin G, Reuber L, Jiang CZ, Keddie J,
Adam L, Pineda O, Ratcliffe OJ, Samaha RR et al.
(2000) Arabidopsis transcription factors: genome-wide
comparative analysis among eukaryotes. Science
290:
21052110 Robertson KD (2002) DNA methylation and chromatin: unraveling the tangled web. Oncogene 21: 53615379[CrossRef][Web of Science][Medline] Robin S, Schbath S (2001) Numerical comparison of several approximations of the word count distribution in random sequences. J Comput Biol 8: 349359[CrossRef][Medline] Rooney JW, Sun YL, Glimcher LH, Hoey T (1995) Novel NFAT sites that mediate activation of the interleukin-2 promoter in response to T-cell receptor stimulation. Mol Cell Biol 15: 62996310[Abstract]
Rossi V, Motto M, Pellegrini L (1997) Analysis
of the methylation pattern of the maize opaque-2
(O2) promoter and in vitro binding studies indicate that the
O2 B-Zip protein and other endosperm factors can bind to methylated target
sequences. J Biol Chem 272:
1375813765 Roth FP, Hughes JD, Estep PW, Church GM (1998) Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nat Biotechnol 16: 939945[CrossRef][Web of Science][Medline] Sagot MF, Myers EW (1998) Identifying satellites and periodic repetitions in biological sequences. J Comput Biol 5: 539553[Medline]
Salgado H, Santos-Zavaleta A, Gama-Castro S, Millan-Zarate D,
Diaz-Peredo E, Sanchez-Solano F, Perez-Rueda E, Bonavides-Martinez C,
Collado-Vides J (2001) RegulonDB (version 3.2):
transcriptional regulation and operon organization in Escherichia
coli K-12. Nucleic Acids Res
29:
7274 Schbath S (1997) An efficient statistic to detect over- and under-represented words in DNA sequences. J Comp Biol 4: 189192 Schbath S (2000) An overview on the distribution of word counts in Markov chains. J Comput Biol 7: 193201[CrossRef][Medline] Schbath S, Prum B, de Turckheim E (1995) Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences. J Comput Biol 2: 417437[Medline] Scherf M, Klingenhoff A, Werner T (2000) Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach. J Mol Biol 297: 599606[CrossRef][Web of Science][Medline]
Seki M, Narusaka M, Kamiya A, Ishida J, Satou M, Sakurai T,
Nakajima M, Enju A, Akiyama K, Oono Y et al. (2002)
Functional annotation of a full-length Arabidopsis cDNA collection.
Science 296:
141145 Sinha S, Tompa M (2000) A statistical method for finding transcription factor binding sites. Proc Int Conf Intell Syst Mol Biol 8: 344354[Medline]
Sinha S, Tompa M (2002) Discovery of novel
transcription factor binding sites by statistical overrepresentation.
Nucleic Acids Res 30:
55495560 Sivolob AV, Khrapunov SN (1995) Translational positioning of nucleosomes on DNA: the role of sequence-dependent isotropic DNA bending stiffness. J Mol Biol 247: 918931[CrossRef][Web of Science][Medline] Sorensen MB, Muller M, Skerritt J, Simpson D (1996) Hordein promoter methylation and transcriptional activity in wild-type and mutant barley endosperm. Mol Gen Genet 250: 750760[Web of Science][Medline] Southern EM (2001) DNA microarrays: history and overview. Methods Mol Biol 170: 115[Medline] Stormo GD (1988) Computer methods for analyzing sequence recognition of nucleic acids. Annu Rev Biophys Biophys Chem 17: 241263[CrossRef][Web of Science][Medline] Stormo GD (1990) Consensus patterns in DNA. Methods Enzymol 183: 211221[Web of Science][Medline]
Stormo GD, Hartzell GW, 3rd (1989) Identifying
protein-binding sites from unaligned DNA fragments. Proc Natl Acad Sci
USA 86:
11831187 Struhl K (1999) Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98: 14[CrossRef][Web of Science][Medline]
Struhl K (2001) Gene regulation: a paradigm for
precision. Science 293:
10541055 Sturaro M, Viotti A (2001) Methylation of the Opaque2 box in zein genes is parent-dependent and affects O2 DNA binding activity in vitro. Plant Mol Biol 46: 549560[CrossRef][Web of Science][Medline]
Sugimoto N, Nakano S, Yoneyama M, Honda K
(1996) Improved thermodynamic parameters and helix initiation
factor to predict stability of DNA duplexes. Nucleic Acids Res
24:
45014505
Sved J, Bird A (1990) The expected equilibrium
of the CpG dinucleotide in vertebrate genomes under a mutation model.
Proc Natl Acad Sci USA 87:
46924696
Takai D, Jones PA (2002) Comprehensive analysis
of CpG islands in human chromosomes 21 and 22. Proc Natl Acad Sci
USA 99:
37403745
Thijs G, Lescot M, Marchal K, Rombauts S, De Moor B,
Rouzé P, Moreau Y (2001) A higher order
background model improves the detection of promoter regulatory elements by
Gibbs sampling. Bioinformatics
17:
11131122 Thijs G, Marchal K, Lescot M, Rombauts S, De Moor B, Rouzé P, Moreau Y (2002a) A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes. J Comput Biol 9: 447464[CrossRef][Web of Science][Medline]
Thijs G, Moreau Y, De Smet F, Mathys J, Lescot M, Rombauts S,
Rouzé P, De Moor B, Marchal K, Déhais P et al.
(2002b) INCLUSive: INtegrated Clustering, Upstream sequence
retrieval and motif Sampling. Bioinformatics
18:
331332
Thompson JD, Higgins DG, Gibson TJ (1994)
CLUSTAL W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap penalties and
weight matrix choice. Nucleic Acids Res
22:
46734680 Tjian R, Maniatis T (1994) Transcriptional activation: a complex puzzle with a few easy pieces. Cell 77: 58[CrossRef][Web of Science][Medline]
Tompa M (2001) Identifying functional elements
by comparative DNA sequence analysis. Genome Res
11:
11431144 Travers A, Drew H (1997) DNA recognition and nucleosome organization. Biopolymers 44: 423433[CrossRef][Web of Science][Medline] Tsunoda T, Takagi T (1998) Estimating transcription factor bindability on DNA. Bioinformatics 15: 622630
Vanet A, Marsan L, Labigne A, Sagot MF (2000)
Inferring regulatory elements from a whole genome. An analysis of
Helicobacter pylori Vanet A, Marsan L, Sagot MF (1999) Promoter sequences and algorithmical methods for identifying them. Res Microbiol 150: 779799[Medline] van Helden J, Andre B, Collado-Vides J (1998) Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. J Mol Biol 281: 827842[CrossRef][Web of Science][Medline]
van Helden J, del Olmo M, Perez-Ortin JE
(2000a) Statistical analysis of yeast genomic downstream
sequences reveals putative polyadenylation signals. Nucleic Acids
Res 28:
10001010
van Helden J, Rios AF, Collado-Vides J (2000b)
Discovering regulatory elements in non-coding sequences by analysis of spaced
dyads. Nucleic Acids Res 28:
18081818 Vaucheret H, Fagard M (2001) Transcriptional gene silencing in plants: targets, inducers and regulators. Trends Genet 17: 2935[CrossRef][Web of Science][Medline]
Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG,
Smith HO, Yandell M, Evans CA, Holt RA et al (2001)
The sequence of the human genome. Science
291:
13041351
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M,
Frijters A, Pot J, Peleman J, Kuiper M et al (1995)
AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res
23:
44074414 Waibel AH, Hanazawa T, Hinton GE, Shikano K, Lang KJ (1989) Phoneme recognition using time-delay neural networks. IEEE Trans Acoustic Speech Signal Process 37: 328339[CrossRef] Wasserman WW, Palumbo M, Thompson W, Fickett JW, Lawrence CE (2000) Human-mouse genome comparisons to locate regulatory sites. Nat Genet 26: 225228[CrossRef][Web of Science][Medline] Weber H, Ziechmann C, Graessmann A (1990) In vitro DNA methylation inhibits gene expression in transgenic tobacco. EMBO J 9: 44094415[Web of Science][Medline] Werner T (2000) Computer-assisted analysis of transcription control regions: Matinspector and other programs. Methods Mol Biol 132: 337349[Medline]
Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V,
Meinhardt T, Pruss M, Reuter I, Schacherer F (2000)
TRANSFAC: an integrated system for gene expression regulation. Nucleic
Acids Res 28:
316319
Wingender E, Dietze P, Karas H, Knuppel R
(1996) TRANSFAC: a database on transcription factors and their
DNA binding sites. Nucleic Acids Res
24:
238241
Wolfertstetter F, Frech K, Herrmann G, Werner T
(1996) Identification of functional elements in unaligned nucleic
acid sequences by a novel tuple search algorithm. Comput Appl
Biosci 12:
7180
Wolffe AP, Matzke MA (1999) Epigenetics:
regulation through repression. Science
286:
481486 Workman CT, Stormo GD (2000) ANN-Spec: a method for discovering transcription factor binding sites with improved specificity. Pac Symp Biocomput 467478
Zhang MQ (1998) Identification of human gene
core promoters in silico. Genome Res
8:
319326
Zhang SH, Lawton MA, Hunter T, Lamb CJ (1994)
atpk1, a novel ribosomal protein kinase gene from
Arabidopsis: I. Isolation, characterization, and expression. J
Biol Chem 269:
1758617592
Zhu J, Liu JS, Lawrence CE (1998) Bayesian
adaptive sequence alignment algorithms. Bioinformatics
14:
2539
Zhu J, Zhang MQ (1999) SCPD: a promoter
database of the yeast Saccharomyces cerevisiae.
Bioinformatics 15:
607611 This article has been cited by other articles: