Plant Physiology 132:1162-1176 (2003)
© 2003 American Society of Plant Biologists
BIOINFORMATICS
Computational Approaches to Identify Promoters and cis-Regulatory Elements in Plant Genomes1
Stephane Rombauts2, Kobe Florquin2, Magali Lescot, Kathleen Marchal, Pierre Rouzé* and Yves Van de Peer
Department of Plant Systems Biology, Flanders Interuniversity Institute
for Biotechnology, Ghent University, B9000 Gent, Belgium (S.R., K.F.,
Y.V.d.P.); Laboratoire de Génétique et Physiologie du
Développement, Equipe bioinformatique, Centre National de la Recherche
Scientifique, Parc Scientifique de Luminy, F-13288 Marseille Cedex 9, France
(M.L.); Department of Electrical Engineering (Electronics, Systems,
Automatisation and Technology-Signals, Identification, System Theory, and
Automation), Katholieke Universiteit Leuven, B3001 Heverlee, Belgium
(K.M.); and Laboratoire Associé de l'Institut National de la Recherche
Agronomique (France), Ghent University, K.L. Ledeganckstraat 35, B9000
Gent, Belgium (P.R.)
The identification of promoters and their regulatory elements is one of the
major challenges in bioinformatics and integrates comparative, structural, and
functional genomics. Many different approaches have been developed to detect
conserved motifs in a set of genes that are either coregulated or orthologous.
However, although recent approaches seem promising, in general, unambiguous
identification of regulatory elements is not straightforward. The delineation
of promoters is even harder because of their complex nature, and in silico
promoter prediction is still in its infancy. Here, we review the different approaches
that have been developed for identifying promoters and their regulatory
elements. We discuss the detection of cis-acting regulatory elements using
word-counting or probabilistic methods (so-called "search by
signal" methods) and the delineation of promoters by considering both
sequence content and structural features ("search by content"
methods). As an example of search by content, we explored in greater detail
the association of promoters with CpG islands. However, due to differences in
sequence content, the parameters used to detect CpG islands in humans and
other vertebrates cannot be used for plants. Therefore, a preliminary attempt
was made to identify parameters that might delineate CpG and CpNpG islands
in Arabidopsis, by exploring the compositional landscape around the
transcriptional start site. To this end, a data set of more than 5,000 gene
sequences was built, including the promoter region, the 5'-untranslated
region, and the first introns and coding exons. Preliminary analysis shows
that promoter location based on the detection of potential CpG/CpNpG islands
in the Arabidopsis genome is not straightforward. Nevertheless, because the
landscape of CpG/CpNpG islands differs considerably between promoters and
introns on the one hand and exons (whether coding or not) on the other, more
sophisticated approaches can probably be developed for the successful
detection of "putative" CpG and CpNpG islands in plants.
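
As a rough illustration of the compositional landscape mentioned above, the following Python sketch computes observed/expected ratios for CpG and CpNpG in sliding windows along a sequence. It is not the procedure used in this study: the function names `obs_exp` and `landscape`, the 200-bp window, the 50-bp step, and the expected count based on independent base frequencies are all assumptions chosen for illustration.

```python
from typing import Iterator, Tuple


def obs_exp(seq: str, gap: int) -> float:
    """Observed/expected ratio for a C followed by a G with `gap` bases between.

    gap=0 corresponds to CpG, gap=1 to CpNpG. The expected count assumes
    independent positions (an illustrative assumption):
    freq(C) * freq(G) * number of candidate positions.
    """
    seq = seq.upper()
    n = len(seq)
    positions = n - gap - 1  # number of places the pattern could start
    if positions <= 0:
        return 0.0
    observed = sum(
        1
        for i in range(positions)
        if seq[i] == "C" and seq[i + gap + 1] == "G"
    )
    expected = (seq.count("C") / n) * (seq.count("G") / n) * positions
    return observed / expected if expected > 0 else 0.0


def landscape(
    seq: str, window: int = 200, step: int = 50
) -> Iterator[Tuple[int, float, float]]:
    """Yield (window start, CpG o/e, CpNpG o/e) for each sliding window."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, obs_exp(chunk, 0), obs_exp(chunk, 1)


if __name__ == "__main__":
    # Toy sequence only; a real analysis would read regions around the
    # transcriptional start site from annotated genomic sequence.
    toy = "ATATATATAT" * 20 + "CAGCTGCGCG" * 20
    for start, cpg, cpnpg in landscape(toy):
        print(start, round(cpg, 2), round(cpnpg, 2))
```

No threshold is applied here, because the cutoffs used for vertebrates are stated above not to carry over to plants. Comparing such profiles between promoter, intron, and exon windows is one possible way to look for the differences described in the abstract.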
Article, publication date, and citation information can be found at
www.plantphysiol.org/cgi/doi/10.1104/pp.102.017715.
1 This work was supported by the Vlaams Instituut voor de Bevordering van het
Wetenschappelijk-Technologisch Onderzoek (grant no. STWW980396). K.F.
is indebted to the Instituut voor de aanmoediging van Innovatie door
Wetenschap en Technologie in Vlaanderen for a predoctoral fellowship, K.M. is a
Research Fellow of the Fund for Scientific Research (Flanders), and P.R. is a
Research Director of the Institut National de la Recherche Agronomique
(France).
2 These authors contributed equally to the paper.
* Corresponding author; e-mail pierre.rouze{at}gengenp.rug.ac.be; fax 3292645349.
Received November 14, 2002; returned for revision January 10, 2003; accepted March 17, 2003.