Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

During plant-pathogen interaction, the recognition of conserved molecules, so-called microbe-associated molecular patterns, by pattern recognition receptors leads to basal immunity via a signaling cascade. Basal immunity is associated with the up-regulation of pathogen response genes (Boller and Felix, 2009). Some pathogens suppress this immune reaction with effector molecules transferred to the plant cell (Grant et al., 2006). These effector molecules interfere at different steps in the signaling process. Some of these molecules, termed avirulence proteins, can be recognized by the plant cell via resistance proteins that induce effector-triggered immunity (Bent and Mackey, 2007). The establishment of both basal and effector-triggered immunity involves changes in nuclear gene expression. Pathogen-responsive cis-regulatory elements in the promoters of genes play an important role in regulating these changes.
Several pathogen-responsive transcription factors (TFs) have been identified in recent years (Tena et al., 2011). Most prominently, the WRKY TFs are major regulators in plant-pathogen interactions (Rushton et al., 2010). Genes regulated by WRKY TFs harbor the W-box (Eulgem et al., 2000). W-boxes, for example, are abundant in genes up-regulated during systemic acquired resistance (Maleck et al., 2000). In Arabidopsis (Arabidopsis thaliana), WRKY22 and -29 were described as downstream targets of a flagellin-regulated mitogen-activated protein kinase (MPK) signal transduction pathway involving MPK3 and -6 (Asai et al., 2002). Later, WRKY33 was determined to be a direct target of MPK4, which is part of a different branch of the flagellin-activated mitogen-activated protein kinase pathway (Suarez-Rodriguez et al., 2007). MPK4 forms a nuclear complex with MKS (for mitogen-activated protein kinase substrate) and WRKY33. WRKY33 is released from this complex upon phosphorylation of MKS by MPK4 and activates the transcription of its target genes (Qiu et al., 2008). Also, WRKY18, -40, and -60 were shown to be involved in pathogen-regulated gene expression (Xu et al., 2006). WRKY TFs have been analyzed extensively and are also associated with functions that go beyond plant-pathogen interactions (Rushton et al., 2010). In addition to WRKY factors, many more TFs, such as members of the bZIP, MYB, and AP2/EREBP families, have been identified as targets of pathogen-regulated signaling pathways (Bethke et al., 2009; Pitzschke et al., 2009; Tena et al., 2011).
For only about one-half of the 60 TF families predicted in Arabidopsis (Pérez-Rodríguez et al., 2010), binding site specificities are known (Bülow et al., 2010). Based on this, there is ample room for database-assisted data mining for novel cis-regulatory elements belonging either to TFs with unknown binding specificity or to previously undetected combinatorial elements. The identification of cis-regulatory sequences has been greatly facilitated using bioinformatic approaches and Web-queryable resources (Hehl and Wingender, 2001; Hehl et al., 2004; Hehl and Bülow, 2008; Brady and Provart, 2009; Priest et al., 2009; Usadel et al., 2009). One approach is the detection of conserved sequences within coregulated genes. This implies that such sequences are targeted by the same or closely related TFs. Such an analysis can be performed by submitting sets of genes to database-assisted analysis to identify known transcription factor binding sites (TFBS; Galuschka et al., 2007; Chang et al., 2008; Bülow et al., 2009). Another approach is the discovery of conserved sequence motifs in a set of coregulated genes regardless of whether these sequences have been associated with a function before. This involves the identification of overrepresented sequences in the upstream region of coregulated genes by using pattern-mining programs such as MEME, AlignACE, CONSENSUS, Co-Bind, BioProspector, and MITRA (Bailey and Elkan, 1995; Roth et al., 1998; Hertz and Stormo, 1999; GuhaThakurta and Stormo, 2001; Liu et al., 2001; Eskin and Pevzner, 2002). To further refine the output of motif-discovery programs, Bio-Optimizer (Jensen and Liu, 2004) was integrated with MEME, AlignACE, CONSENSUS, and BioProspector in BEST, the binding site estimation suite of tools (Che et al., 2005).
Figure 1. Bioinformatic identification and experimental analysis of conserved sequence motifs in pathogen-coregulated genes.
Figure 2. Classification of sequence motif groups. A motif relationship tree generated by STAMP and illustrated with MEGA is shown. The number of motifs in each of the 37 motif families is given in parentheses. Similarities to TFBS were determined using STAMP. Motif groups showing high similarity to known cis-regulatory sequences are indicated by the designation of the corresponding TF family and the lowest E value.
In the work presented here, BEST was applied for the discovery of conserved sequence motifs in promoters of Arabidopsis genes up-regulated by multiple pathogenic stimuli. These coregulated genes were identified using microarray expression data annotated to the PathoPlant database (Bülow et al., 2004, 2007). Subsequently, the identified motifs were classified with STAMP (Mahony and Benos, 2007) and compared with known cis-regulatory sequences annotated to the AthaMap, PLACE, and AGRIS databases (Higo et al., 1999; Davuluri et al., 2003; Steffens et al., 2004; Bülow et al., 2006; Palaniswamy et al., 2006; Yilmaz et al., 2011). Sequences showing no or only low similarity to already described regulatory sequences were analyzed with synthetic promoters and the parsley (Petroselinum crispum) protoplast system to test for elicitor-responsive gene expression (Hahlbrock et al., 1995; Sprenger-Haussels and Weisshaar, 2000; Rushton et al., 2002).

RESULTS AND DISCUSSION
Identification of 37 Motif Families Harboring Novel and Known cis-Sequences in Pathogen-Responsive Arabidopsis Genes
The goal of this work was the identification of novel pathogen-responsive cis-regulatory sequences from groups of genes simultaneously up-regulated by pathogenic stimuli. To achieve this goal, conserved sequences were identified in the promoters of these up-regulated genes. Figure 1 gives an overview of the work flow. In a first step, microarray experiments annotated to the PathoPlant database were employed to identify gene sets that are at least 2-fold up-regulated by one to six pathogen-related stimuli. In total, 732 queries were performed. Supplemental Table S1 lists the parameters for these queries and gives the number of induced genes obtained. A total of 711 of the 732 queries were performed with microarray data generated by treatment with Botrytis cinerea, Erysiphe orontii, or Phytophthora infestans, alone or in combination with other pathogenic stimuli. In total, 510 queries with one to five stimuli yielded coregulated genes. None of the queries with six stimuli yielded coregulated genes. In cases where more than 120 genes were identified, the induction factor was increased to reduce the number of up-regulated genes to about 120 (Supplemental Table S1). This facilitates subsequent motif detection using BEST.
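The gene-set selection described above can be illustrated with a minimal Python sketch. It is not the SQL query used against PathoPlant; the function name `select_coregulated`, the input layout (gene identifier mapped to one induction factor per stimulus), and the fixed step by which the cutoff is raised are assumptions made only for illustration.

```python
def select_coregulated(induction, min_factor=2.0, max_genes=120, step=0.5):
    """Return (cutoff, genes) for genes induced by every stimulus.

    induction: dict mapping a gene ID to a list of induction factors,
               one per stimulus in the query (hypothetical layout).
    The cutoff starts at min_factor and is raised by `step` (an assumed
    increment) until at most max_genes genes remain.
    """
    cutoff = min_factor
    while True:
        # A gene counts as coregulated only if all stimuli reach the cutoff.
        hits = sorted(g for g, factors in induction.items()
                      if min(factors) >= cutoff)
        if len(hits) <= max_genes:
            return cutoff, hits
        cutoff += step


# Toy data: four genes, two stimuli, and a deliberately small limit.
demo = {
    "g1": [2.5, 3.0],
    "g2": [4.0, 5.0],
    "g3": [1.5, 6.0],
    "g4": [3.2, 2.1],
}
print(select_coregulated(demo, max_genes=2))
```

With the toy data, three genes pass the 2-fold cutoff, so the cutoff is raised once and two genes remain.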
In a second step, conserved sequence motifs within the upstream region of the 510 coregulated gene groups were identified (Fig. 1). For this, the software package BEST was employed (Che et al., 2005). The package combines four different motif-finding programs (MEME, AlignACE, CONSENSUS, and BioProspector). For motif detection, 1,000 bp upstream of all genes in a coinduced gene group were screened with this software. This analysis resulted in 443 sequence motifs. The alignment matrices of the 443 motifs are shown in Supplemental Data File S1. In some instances, the same sequence motifs are detected when different queries yield the same groups of up-regulated genes (Supplemental Table S2). When these are subtracted, 407 different sequence motifs remain. These sequence motifs potentially represent conserved cis-sequences in pathogen up-regulated genes.
The 407 sequence motifs detected with BEST were further analyzed using STAMP (Fig. 1, step 3). STAMP not only determines the relationship between the motifs but also identifies their similarities to known cis-regulatory sequences (Mahony and Benos, 2007). To display the similarities among all 407 sequence motifs, a tree was generated with STAMP and illustrated by the program MEGA (Tamura et al., 2007). Originally, the tree branches into all 407 motifs, with branch length between nodes becoming shorter as similarities between adjacent motifs become higher. To facilitate the display of the tree and to generate motif groups for easier analysis, a cutoff value of 0.035, the minimum length of a branch between two nodes, was selected in MEGA. When this cutoff is applied, 39 motif groups are obtained. Figure 2 shows the motif group tree with branches to 39 motif groups, but only 37 motif groups are designated. This is because motif groups 20 and 25 each combine two very closely related groups. Groups 20 and 25 were therefore generated with higher cutoff values of 0.043 and 0.045, respectively. These 37 motif groups harbor two to 76 single motifs (Fig. 2). The motifs that belong to each of the 37 motif groups are identified in Supplemental Table S2.
To determine the similarity of the motifs among each other in more detail and to find similarities to known cis-regulatory sequences, all motifs within each of the 37 motif groups were submitted to STAMP to generate a so-called family binding profile (FBP). FBPs are constructed through multiple alignments of structurally related DNA-binding motifs (Sandelin and Wasserman, 2004). They can be displayed as a sequence logo showing the probability of the most frequent nucleotide(s) at each position (Crooks et al., 2004). All 37 FBPs were then queried with STAMP against the PLACE, AGRIS, and AthaMap databases that contain plant cis-regulatory sequences (Higo et al., 1999; Palaniswamy et al., 2006; Bülow et al., 2010). Supplemental Table S3 shows the most similar cis-regulatory sequences or TFBS obtained for each FBP with the three databases. The highest similarity is defined by the lowest E value. For cases where a high similarity (E value lower than 1e-05) to known TFBS was detected with all three databases, the results are shown in Figure 2. This reveals that groups 1, 3, and 7 have a high similarity to basic region/leucine zipper motif (bZIP) TFBS. Groups 18 and 20 have similarities to MYB and TCP TFBS, respectively, while groups 24 and 35 harbor similarity to ethylene-responsive element binding factor (ERF) and E2F TFBS, respectively. Interestingly, group 27 shows similarity to WRKY TFBS. WRKY TFs play a major role in pathogen-regulated gene expression (Rushton et al., 2010). WRKY TFBS, therefore, are expected to show up in this approach. Even more interesting is that many motif groups do not show high similarities to known plant cis-regulatory sequences (Fig. 2) and might contain so far unknown cis-sequences. The following sections describe the experimental analysis of selected cis-sequences from different motif groups (Fig. 1, step 4). Since one in three cis-sequences analyzed was elicitor responsive, no further bioinformatic analysis to exclude false positives was performed prior to the experimental analysis with the identified motifs.

A Modified Reporter Gene Plasmid with an Internal Transformation Control for Efficient Normalization of Gene Expression
To investigate whether the conserved sequence motifs harbor functional cis-sequences, the parsley protoplast system was used (Hahlbrock et al., 1995). In this system, parsley protoplasts are generated from a liquid callus suspension culture, transformed with reporter gene constructs, and subjected to an oligopeptide elicitor (Pep25) derived from a surface glycoprotein of the phytopathogenic fungus Phytophthora sojae (Nürnberger et al., 1994; Rushton et al., 1996). Subsequently, reporter gene activity is compared between Pep25-treated and untreated cells.
To test selected sequences, a reporter plasmid based on pBT10GUS harboring the GUS reporter gene (uidA) was used (Sprenger-Haussels and Weisshaar, 2000). To quantify expression strength, a second constitutively expressed reporter gene is usually employed on a cotransformed plasmid. To facilitate reporter gene assays, pBT10GUS was modified to contain a luciferase reporter gene under the control of a double 35S cauliflower mosaic virus (CaMV) promoter (pBT10GUS-d35SLUC). The luciferase gene serves as an internal control to normalize GUS expression driven by the cloned cis-sequences. This permits the comparison of independent experiments.
To test this vector, the D element previously shown to confer elicitor (Pep25)-responsive expression in parsley protoplasts was used (Rushton et al., 2002). The D-box is derived from the parsley PR2 gene promoter (van de Löcht et al., 1990; Kirsch et al., 2000; Rushton et al., 2002). Figure 3 shows a schematic representation of pBT10GUS-d35SLUC and normalized GUS reporter gene activity in parsley protoplasts with and without elicitor (Pep25). Reporter gene activity was measured for pBT10GUS-d35SLUC with a synthetic promoter containing four copies of the D-box (4D) cloned upstream of the 35S minimal promoter (TATA) and the GUS (uidA) reporter gene. Furthermore, a vector-only control was analyzed (pBT10GUS-d35SLUC).
All GUS values were normalized by the corresponding level of luciferase activity. This shows that the synthetic promoter harboring the D elements confers strong elicitor-responsive reporter gene activity in parsley protoplasts (Fig. 3B). The observation that the GUS reporter gene under the control of the minimal promoter (TATA box) in pBT10GUS-d35SLUC does not show significant reporter gene expression with or without Pep25 also demonstrates the lack of background expression in this modified vector.
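The normalization step can be written out as a short Python sketch. The function names and the example readings are hypothetical, and the fold-induction ratio between Pep25-treated and untreated samples is an assumed summary statistic for illustration; the text itself only states that GUS values were divided by the corresponding luciferase activity.

```python
def normalized_gus(gus, luc):
    """GUS activity divided by luciferase activity from the same sample."""
    if luc <= 0:
        raise ValueError("luciferase activity must be positive")
    return gus / luc


def fold_induction(gus_plus, luc_plus, gus_minus, luc_minus):
    """Ratio of normalized GUS with (+) and without (-) Pep25.

    This ratio is an illustrative assumption, not a value reported
    in the article.
    """
    return (normalized_gus(gus_plus, luc_plus)
            / normalized_gus(gus_minus, luc_minus))


# Hypothetical raw readings in arbitrary units.
print(normalized_gus(200.0, 50.0))
print(fold_induction(800.0, 100.0, 100.0, 50.0))
```

Because both reporters sit on the same plasmid, the ratio cancels differences in transformation efficiency between samples, which is what allows independent experiments to be compared.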

Experimental Analysis of Conserved cis-Sequences
In total, 76 single cis-sequences from different potentially pathogen-responsive motifs that were identified as described above were tested for Pep25 responsiveness. Twenty-five of these sequences were found to be elicitor responsive in the parsley protoplast system when four copies of each sequence were cloned upstream of the 35S minimal promoter (TATA box) of the uidA reporter gene in pBT10GUS-d35SLUC (Fig. 3C). Table I lists all 25 elicitor-responsive sequences. The core sequence that was used by BEST for generating a motif is underlined. Furthermore, the Arabidopsis gene from which the sequence originates and the position (The Arabidopsis Information Resource [TAIR] 10 genome release) of the core sequence in motif orientation (first nucleotide) upstream of the gene are identified. Also, the motif group to which the core sequence belongs is listed. The sequence identifier (i.e. 12i) permits the identification of the database query in Supplemental Table S1, the identification of the respective motif (12i_M1) in Supplemental Data File S1, and the number of the experimentally tested sequence (12i_M1_S1; Table I). Table II lists
I) driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (-) and with (+) Pep25 elicitor. B, Alignment of elicitor-responsive sequences 18 through 24. Underlined are W-box core sequences. A core sequence determined from the alignment is shaded. C, FBP generated by STAMP with all 26 motifs in motif group 27 (Fig. 2).
functionality of sequences from motif group 27, several sequences from this group were tested experimentally. Seven sequences turned out to be elicitor responsive. Figure 4A shows the result of the transient reporter gene assays with sequences 18 through 24 from group 27 (Table I). The tested sequences are derived from five different sets of up-regulated genes. Gene set 21S contains seven genes up-regulated by P. infestans 12 h post infection (hpi) and the salicylic acid analog benzo[1,2,3]thiadiazole-7-carbothioic acid-S-methyl ester (BTH), 30I-8 contains 52 genes up-regulated by E. orontii 5 d post infection and the oomycete-derived elicitor NPP1, 27D-10 contains nine genes up-regulated by E. orontii (24 hpi) and NPP1, 14S contains nine genes up-regulated by B. cinerea (18 hpi) and BTH, and 30H-8 contains 27 genes up-regulated by E. orontii (4 d post infection) and NPP1 (Supplemental Table S1). Compared with the positive control 4D, the sequences do not yield the same level of Pep25-responsive reporter gene activity (Table II), but the values are clearly above those obtained for the empty vector without integrated cis-sequences (Fig. 4A). Although motif group 27 harbors similarity with WRKY binding sites, this similarity is not obvious when sequences that confer elicitor-responsive gene expression are aligned. Figure 4B shows an alignment of the seven experimentally positive sequences, which reveals a novel core sequence, GACTTTT. Only two sequences deviate slightly from this core sequence (23 and 24). For the alignment, the reverse complements of sequences 18, 19, 21, 22, and 23 were used. Two of the sequences come from the promoter of the same gene (18 and 19; Table I) that was up-regulated in two different queries (21S and 14S). The core sequence of a WRKY TFBS, the W-box TTGAC(C/T) (Rushton et al., 2010), is only found in three of the sequences (21, 22, and 24; Fig. 4B). Sequences harboring this core sequence are potentially bound by several Arabidopsis WRKY TFs (de Pater et al., 1996; Du and Chen, 2000; Yu et al., 2001; Chen and Chen, 2002; Robatzek and Somssich, 2002; Xu et al., 2006; Ciolkowski et al., 2008; Kim et al., 2008). Because the other sequences lack the conserved WRKY core binding site, these elicitor-responsive sequences from group 27 may be bound either by other WRKY TFs or by TFs not previously associated with a pathogen response. In this instance, it is interesting that NtWRKY12 binds to the sequence TTTTCCAC, which significantly deviates from the classical W-box (van Verk et al., 2008). Other WRKY binding sites that deviate from the W-box are the sugar-responsive element and the PRE4 element (Rushton et al., 2010).
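Checking a tested sequence for the W-box TTGAC(C/T) or the core GACTTTT on either strand can be sketched in a few lines of Python. The helper names and the example sequence are hypothetical and are not taken from Table I; only the two patterns come from the text.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, pattern):
    """Find a regex pattern on both strands.

    Returns 0-based start positions in forward-strand coordinates,
    keyed by strand. A lookahead is used so overlapping sites are found.
    """
    seq = seq.upper()
    regex = re.compile(f"(?=({pattern}))")
    plus = [m.start() for m in regex.finditer(seq)]
    rc = reverse_complement(seq)
    minus = [len(seq) - (m.start() + len(m.group(1)))
             for m in regex.finditer(rc)]
    return {"+": plus, "-": sorted(minus)}


W_BOX = "TTGAC[CT]"   # W-box core as given in the text
CORE_27 = "GACTTTT"   # core sequence from motif group 27

example = "AAAAGTCGGTTGACC"  # hypothetical sequence, not from Table I
print(find_sites(example, W_BOX))
print(find_sites(example, CORE_27))
```

Scanning both strands matters here because several of the aligned sequences were used as reverse complements, so a core such as GACTTTT may appear as AAAAGTC in the promoter orientation.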
The FBP determined for motif family 27 resembles a motif identified in promoters of genes up-regulated in plants exposed to methyl viologen, a superoxide propagator in the light (Scarpeci et al., 2008). Interestingly, motif sequence 22 from gene At5g24110 (Table I), which codes for AtWRKY30, also contributes to the previously described motif (Scarpeci et al., 2008). While promoter-reporter gene studies indicate that WRKY30 is induced by reactive oxygen species (Scarpeci et al., 2008), the work presented here shows that sequence 22 from the WRKY30 promoter confers elicitor-responsive gene expression in parsley protoplasts (Fig. 4A).
Figure 5. Elicitor-responsive sequences from motif group 11. A, Quantitative GUS reporter gene assays using pBT10GUS-d35SLUC (NC) and the same plasmid containing synthetic promoters 4D (PC) or four copies of single sequences 5 through 8 (Table I) driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (-) and with (+) Pep25 elicitor. B, Alignment of elicitor-responsive sequences 5 through 8. Underlined are W-box core sequences. A core sequence determined from the alignment is shaded. C, FBP generated by STAMP with all 10 motifs in motif group 11 (Fig. 2).
In a similar study, novel cis-regulatory elements conserved in coexpressed genes from Arabidopsis were predicted recently (Zou et al., 2011). Most interestingly, the novel core sequence GACTTTT determined by aligning the Pep25-responsive sequences from motif group 27 (Fig. 4B) was also detected in that study. Screening Supplemental Data File 2 from Zou et al. (2011) revealed that this sequence was identified either as a stand-alone putative cis-regulatory element or as part of a larger element. For example, the reverse complement sequence AAAAGTC was enriched in promoters of genes responsive to flagellin 22, NPP1, and P. infestans (Zou et al., 2011). Also, the Pep25-responsive sequences determined in our study (Fig. 4) were derived from microarray experiments involving NPP1 and P. infestans, supporting the notion that this sequence is either functional alone or part of a larger or combinatorial cis-regulatory element.
Elicitor-Responsive cis-Sequences from Motif Groups 11 and 12 Harbor MYB-Like cis-Regulatory Sequences
Many of the elicitor-responsive sequences belong to motif groups that are not highly similar to known TFBS (Fig. 2). Figures 5A and 6A show the results of the transient reporter gene assays with sequences 5 through 8 and 9 through 12, respectively. These sequences belong to motif groups 11 and 12. Compared with the positive control 4D, the sequences do not achieve the same level of reporter gene activity (Table II), but values are higher than those obtained for the plasmid without integrated cis-sequences (Figs. 5A and 6A). Thus, the sequences clearly mediate Pep25-responsive reporter gene activity.
The tested sequences from motif group 11 are derived from four different groups of up-regulated genes (Table I). 15AAA contains four genes up-regulated by B. cinerea (48 hpi) and chitin (10 min), 22A contains five genes up-regulated by P. infestans (all) and chitin (10 min), 22DDD contains nine genes up-regulated by P. infestans (12 hpi) and chitin (3 h), and 21G-2 contains 118 genes up-regulated by P. infestans (6 hpi) and salicylic acid (Supplemental Table S1). Two of the sequences came from the same gene (5 and 6; Table I) and were found to be up-regulated in two different queries (15AAA and 22A). Figure 5B shows an alignment of the four sequences tested. For the alignment, the reverse complements of sequences 5, 6, and 7 are shown. Based on the alignment, a new core sequence, TGGTTT, was identified. This core sequence can also be seen in the FBP between positions 3 and 8, but it is not highly conserved in all motifs (Fig. 5C).
Figure 6. Elicitor-responsive sequences from motif group 12. A, Quantitative GUS reporter gene assays using pBT10GUS-d35SLUC (NC) and the same plasmid containing synthetic promoters 4D (PC) or four copies of single sequences 9 through 12 (Table I) driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (-) and with (+) Pep25 elicitor. B, Alignment of elicitor-responsive sequences 9 through 12. Underlined is a W-box core sequence. A core sequence determined from the alignment is shaded. C, FBP generated by STAMP with all 21 motifs in motif group 12 (Fig. 2).
The tested sequences from motif group 12 are derived from three different groups of up-regulated genes (Table I). 18H contains 21 genes up-regulated by B. cinerea (48 hpi) and NPP1 (1 h), 38M contains 115 genes up-regulated by E. orontii (all) and Pseudomonas syringae pv maculicola (avr+), and 26LLL contains 55 genes up-regulated by E. orontii (all) and NPP1 (1 and 4 h; Supplemental Table S1). Figure 6B shows an alignment of the four sequences tested. For the alignment, the reverse complements of sequences 10, 11, and 12 are shown. Based on the alignment, a new core sequence, GTTT, was identified that is not seen as highly conserved in the FBP (Fig. 6C).
The two core sequences derived from Pep25-responsive sequences of motif groups 11 (TGGTTT) and 12 (GTTT) illustrate their close relationship in the motif tree (Fig. 2). Interestingly, the sequence TGGTTT or its reverse complement was not detected when cis-regulatory elements recently predicted in a similar study were screened (Zou et al., 2011, Supplemental Data File 2). The two new core sequences suggest that these elements may be bound by members of the MYB TF family. Interestingly, AtMYB2 binds to the sequence TGGTTT found in the Arabidopsis Adh1 promoter (Hoeren et al., 1998). AtMYB2 belongs to subgroup 20 of the R2R3-MYB family and is implicated in anaerobiosis, drought, salt, and abscisic acid-regulated gene expression (Abe et al., 1997, 2003; Hoeren et al., 1998; Dubos et al., 2010). DNA binding of AtMYB2 may be redox regulated (Serpa et al., 2007), and recently, AtMYB2 was shown to function in plant senescence (Guo and Gan, 2011). Other Arabidopsis MYB factors involved in the response to microbes, insects, or other pathogens are AtMYB72, AtMYB102, and AtMYB108 (Dubos et al., 2010). AtMYB72 is activated in roots upon colonization by nonpathogenic Pseudomonas fluorescens WCS417r. Knockout mutants of AtMYB72 are incapable of mounting induced systemic resistance against the pathogens P. syringae pv tomato, Hyaloperonospora parasitica, Alternaria brassicicola, and B. cinerea (Van der Ent et al., 2008). AtMYB102 is expressed locally in leaves at the feeding sites of the insect herbivore Pieris rapae. Knockout AtMYB102 mutant plants allowed a faster development of P. rapae caterpillars, indicating that AtMYB102 contributes to basal resistance against P. rapae feeding (De Vos et al., 2006). A BOS1 (AtMYB108) mutant showed increased susceptibility to B. cinerea infection. BOS1 is also required to restrict the spread of A. brassicicola (Mengiste et al., 2003).
Plant MYB factors show some variability in their binding sites and often regulate gene expression by interacting with members of the basic helix-loop-helix (bHLH) TF family (Hoeren et al., 1998; Steffens et al., 2005; Feller et al., 2011). While the FBPs of groups 11 and 12 do not show the bHLH binding site consensus sequence, sequence 9 from motif group 12 has a bHLH consensus sequence, CANNTG (Hartmann et al., 2005), between the putative MYB binding site TGGTTT and a W-box (Fig. 6B). In this respect, it is interesting that sequence 9 was obtained from the promoter of the AtWRKY53 gene (At4g23810; Table I). AtWRKY53 is a senescence-associated TF that can bind to its own promoter (Miao et al., 2004). In fact, a W-box core sequence is found in sequence 9 (Fig. 6B). Also, sequences 5 and 6 have a W-box core sequence (Fig. 5B). Another W-box core sequence is generated by the SpeI site in the linker used for cloning of sequence 11 (data not shown). In addition to possible MYB factor binding, this could also indicate the involvement of WRKY factors in the inducibility of sequences 5, 6, 9, and 11. Although core sequences that might play a role in elicitor responsiveness were identified in this study, adjacent sequences may also have a profound effect on the binding of a specific TF.
Figure 7. Elicitor-responsive sequences from seven motif groups. A, Quantitative GUS reporter gene assays using pBT10GUS-d35SLUC (NC) and the same plasmid containing synthetic promoters 4D (PC) or four copies of single sequences 1 through 4, 13 through 17, and 25 (Table I) driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (-) and with (+) Pep25 elicitor. B, Single sequences from motif groups 1 (sequence 1), 5 (2-4), 18 (13), 19 (14), 21 (15, 16), 24 (17), and 32 (25; Table I). Elicitor-responsive sequences 2, 3, and 4 from motif group 5 and sequences 15 and 16 from group 21 are aligned. Underlined is a W-box core sequence. A core sequence determined from the alignment of sequences 2 and 3 is shaded.
Figure 7 and Table II show the results of the transient reporter gene assays with sequences 1 to 4, 13 to 17, and 25. These sequences are derived from nine queries for up-regulated genes (Table I). The number of up-regulated genes in each query and the pathogenic stimuli used in the queries can be found under the query identifiers in Supplemental Table S1. In each of the queries, B. cinerea, E. orontii, or P. infestans was combined with NPP1, BTH, salicylic acid, chitin, or methyl jasmonate. All sequences confer induced reporter gene activity upon treatment with the elicitor Pep25. Sequences 2 and 15 show the highest Pep25-responsive GUS activity of all investigated constructs, while sequence 2 also has a relatively high background activity (Table II). Figure 7B shows an alignment of sequences 2, 3, and 4 from motif group 5 and sequences 15 and 16 from motif group 21 as well as all other single sequences tested. It is remarkable that sequence 3, which harbors only one nucleotide difference in the conserved core sequence (TACGTGACG) compared with sequence 2 (TACGTCACG), shows the weakest Pep25-responsive gene expression (Table II; Fig. 7B). This indicates that either additional sequences adjacent to, or the single nucleotide change within, the core sequence influence the strength of gene expression. The alignment of sequence 4 with sequences 2 and 3 from motif group 5 and sequences 15 and 16 from motif group 21 did not yield a readily identifiable core sequence. Motif groups 1, 3, and 7 show similarities to bZIP TFBS, and group 5 is related to these groups (Fig. 2). However, only sequences 2 and 3 from group 5 show similarity to bZIP binding sites (ACGTG) in their core sequence (Fig. 7B). Motif groups 19 and 32, to which the sequences 14 and 25 belong, do not show a strong similarity to known TFBS. In contrast, motif groups 18 and 24, to which the sequences 13 and 17 belong, show similarity to MYB and ERF TFBS, respectively (Fig. 2). In this context, it should be mentioned that sequence 16 also harbors a core WRKY binding site (Fig. 7B) and that the SpeI linker ligated to sequence 25 generates a core WRKY binding site (data not shown).

Experimental Analysis of cis-Sequences in Nicotiana benthamiana
To investigate the functionality of the elicitor-responsive cis-sequences in another system, N. benthamiana leaves were injected with Agrobacterium tumefaciens harboring promoter-reporter gene constructs. For this, 19 synthetic promoters were cloned into a T-DNA vector with a GUS reporter gene harboring an intron to avoid expression in A. tumefaciens. The results of GUS staining after injection of A. tumefaciens harboring recombinant T-DNAs are shown in Figure 8. In all cases, except for sequence 21, GUS staining was observed. For sequence 9, the GUS staining was very weak. The results from this additional experimental system indicate that most of the cis-sequences also confer reporter gene expression during A. tumefaciens infection.

CONCLUSION AND PERSPECTIVE
Synthetic promoters play an important role for the regulation of pathogen-responsive genes in biotechnological applications (Rushton et al., 2002; Venter, 2007). One such application is the specific expression of genes conferring resistance in response to pathogen attack. For this and other strategies, pathogen-inducible promoters might be the most useful, as they limit the cost of resistance by restricting expression to infection sites (Gurr and Rushton, 2005). To increase the number of available cis-regulatory sequences for the design of synthetic promoters, a bioinformatic approach combining several different databases, software tools, and Web servers was employed. With these, a large number of known and novel sequence motifs conserved in pathogen-coregulated genes were identified. Experimental analysis revealed that these sequence motifs are a useful resource to identify novel elicitor-responsive sequences. These sequences can be useful for the design of synthetic promoters and may uncover, through further experimentation, novel TFs involved in plant-pathogen interactions. The approach presented here may be applicable to identify novel cis-responsive sequences involved in any stress-response reaction.

Bioinformatic Analysis
For the identification of Arabidopsis (Arabidopsis thaliana) genes upregulated by pathogenic stimuli, microarray expression data were downloaded from TAIR (Reiser and Rhee, 2005) and annotated to the PathoPlant database (Bülow et al., 2004).The annotation procedure of complementary DNA microarray data has been described earlier (Bülow et al., 2007).In the case of Affymetrix ATH1 microarray expression sets, for each experiment, several replica slides for control and treated plants exist.For every data set (one control and one treated plant), the "array element" (probe set name), the signals, the signal detections, and the P values were extracted.A Perl script was used to generate an induction factor from the signals of each data set if the detection was at least marginal (GeneChip Expression Analysis, Data Analysis Fundamentals; Affymetrix).If the induction factor was less than 1, the negative reciprocal value was calculated.The processed data sets (table "rawdataaffy") were imported into the PathoPlant database.For gene identification, a text file was downloaded from the TAIR microarray resource (Garcia-Hernandez et al., 2002).This text file includes the information of locus/gene assignments to single array elements based on BLASTn-related genome mapping of the Affymetrix ATH1 array.A Perl script was used to extract the data, and multiple locus assignments were tagged.The processed data and generated table "assignments" with array element, gene identification, and tag were imported into the PathoPlant database.The two tables rawdataaffy and assignments were joined.In order to merge records representing replicate hybridizations, geometric mean values and base-10 logarithms of the SD of individual induction factors were determined using a Microsoft Visual Basic script and stored in table "mean." All data as well as links to the microarray source of the expression set can be found on the PathoPlant site at http://www.pathoplant.de/documentation.php.
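The induction factor and the merging of replicates can be sketched in Python as follows. This is not the authors' Perl or Visual Basic code. The function names are hypothetical, the detection-call filter is omitted, and averaging the treated/control ratios on a log scale before any sign conversion is an assumption, since the text does not say how negative reciprocal values enter the geometric mean.

```python
import math


def signed_induction(treated, control):
    """Induction factor as treated/control signal ratio.

    Ratios below 1 are reported as the negative reciprocal, as described
    in the text (e.g. a ratio of 0.25 becomes -4.0).
    """
    ratio = treated / control
    return ratio if ratio >= 1 else -1.0 / ratio


def geometric_mean_ratio(pairs):
    """Geometric mean of treated/control ratios across replicates.

    pairs: list of (treated_signal, control_signal) tuples with positive
    signals. Averaging the raw ratios (not the signed values) is an
    assumption made for this sketch.
    """
    logs = [math.log(t / c) for t, c in pairs]
    return math.exp(sum(logs) / len(logs))


# Hypothetical signal values in arbitrary units.
print(signed_induction(400.0, 100.0))
print(signed_induction(50.0, 200.0))
print(geometric_mean_ratio([(200.0, 100.0), (800.0, 100.0)]))
```

The geometric mean is the natural average for ratios: a 2-fold and an 8-fold replicate combine to 4-fold, whereas the arithmetic mean would give 5-fold.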
Expression data were used to identify genes coregulated upon different pathogen-related stimuli. Genes up-regulated more than 2-fold upon mainly fungal elicitation were determined using a SQL server query tool. A total of 732 stimulus combinations were queried, including searches for highly induced genes (i.e. genes showing no background transcription [absent in detection call] in untreated plants within two replicates).
PathoPlant's Microarray Expression online tool offers functionality similar to the query tool described above and can be used to determine sets of genes coregulated upon pathogen-related stimuli.
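The idea behind such a coregulation query can be sketched in Python as a set intersection over stimuli. This is only an illustration of the selection logic, not the SQL used for PathoPlant. The data layout (a dictionary of induction factors per stimulus), the function name, and the gene and stimulus labels in the usage note are hypothetical.

```python
def coregulated_genes(induction, stimuli, threshold=2.0):
    """Return genes induced above `threshold` under every stimulus in `stimuli`.

    `induction` maps stimulus name -> {gene id -> induction factor}.
    """
    selected = None
    for stimulus in stimuli:
        factors = induction.get(stimulus, {})
        up = {gene for gene, factor in factors.items() if factor > threshold}
        # Keep only genes that passed for all stimuli seen so far.
        selected = up if selected is None else selected & up
    return sorted(selected or [])
```

With hypothetical input such as `{"stimulusA": {"gene1": 3.5, "gene2": 1.2}, "stimulusB": {"gene1": 2.4, "gene2": 6.0}}`, querying both stimuli returns only `gene1`.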
For promoter analyses, sequences 1,000 nucleotides upstream of the transcription start site of these genes were extracted (TAIR release 6) and converted into FASTA format. To identify overrepresented motif sequences within these promoters, the BEST software package (Che et al., 2005) was locally installed on a Linux SuSE 9.2 system. The package combines four different motif-finding programs (MEME, AlignACE, CONSENSUS, BioProspector) and an optimization step. BEST was run with default parameters and predefined motif lengths of five to 10, 10 to 15, and 15 to 20 nucleotides. Overrepresented sequence motifs identified by BEST were further used if detected by at least two out of the four motif-finding programs. These identified sequence motifs were submitted to the STAMP Web server as TRANSFAC-compatible matrices (Matys et al., 2003; Mahony and Benos, 2007). STAMP classified all motifs based on matrix alignment to a similarity tree given in Newick format that was displayed using MEGA (Tamura et al., 2007). Groups containing similar motifs were generally defined by clustering single motifs on branches with lengths less than 0.035. STAMP was further used for generating FBPs for all motifs of each group by multiple alignments. STAMP was also employed for the identification of motif similarities by comparison with known cis-elements from the plant databases AthaMap, AGRIS, and PLACE (Higo et al., 1999; Davuluri et al., 2003; Steffens et al., 2004).
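Two steps of this workflow, extracting upstream sequences into FASTA format and retaining motifs reported by at least two programs, can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: coordinates are 0-based, genes on the minus strand are returned as the reverse complement, and each motif is already associated with the set of programs that reported it (BEST handles that matching internally). All function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def upstream_region(chromosome, tss, strand, length=1000):
    """Return up to `length` nt upstream of the TSS, in the gene's orientation.

    `tss` is a 0-based index into `chromosome` (an assumption of this sketch).
    """
    if strand == "+":
        return chromosome[max(0, tss - length):tss]
    # Minus strand: upstream lies at higher coordinates; reverse-complement it.
    region = chromosome[tss + 1:tss + 1 + length]
    return region.translate(COMPLEMENT)[::-1]

def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record with fixed line width."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"

def supported_motifs(hits, min_programs=2):
    """Keep motifs reported by at least `min_programs` motif finders.

    `hits` maps motif label -> set of program names that reported it.
    """
    return sorted(m for m, programs in hits.items() if len(programs) >= min_programs)
```

A motif reported only by MEME would be dropped by `supported_motifs`, whereas one reported by both MEME and AlignACE would be kept.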

Plasmid Constructs
For all recombinant DNA work, standard protocols were employed (Sambrook and Russell, 2001). For transient protoplast transformation, a plasmid containing a d35S::LUC cassette and a minimal promoter in front of a GUS reporter gene was generated as follows. The double 35S CaMV promoter (d35S) was recovered by HindIII/XhoI digestion from plasmid p70S-ruc (Stahl et al., 2004) and introduced into HindIII/XhoI-digested pBT10-LUC plasmid (Sprenger-Haussels and Weisshaar, 2000) to give pBT10-d35SLUC. The SacI site in pBT10-d35SLUC was removed by digestion, filling in of the sticky ends, and religation. The d35S::LUC cassette was recovered by BmtI/HindIII digestion in which the BmtI site was filled in with T4 DNA polymerase. This fragment was introduced into BamHI/HindIII-digested pBT10-GUS (Sprenger-Haussels and Weisshaar, 2000), in which the BamHI site was also filled in with T4 DNA polymerase prior to cloning. The resulting plasmid was designated pBT10GUS-d35SLUC. The integrity of the plasmid was confirmed by partial sequencing and restriction analysis.
All oligonucleotides (Table I) were synthesized with partial SpeI and XbaI sites by Eurofins MWG Operon, annealed, and ligated into the SpeI/XbaI sites of pBT10-GUS. Multimerization to tetramers was performed as described previously (Sprenger-Haussels and Weisshaar, 2000; Rushton et al., 2002). The tetramer sequence was removed by SpeI/XbaI digestion and cloned into the SpeI/XbaI sites of pBT10GUS-d35SLUC. Plasmid DNA for protoplast transformation was isolated with the NucleoBond xtra midi EF kit (Macherey-Nagel) as described by the manufacturer.
For transient promoter studies in Nicotiana benthamiana, the synthetic promoter sequences were cloned into the T-DNA vector pBINGUSintron-sbA-VirG726-pm (pBINGUSintron; D. Stahl, unpublished data). This vector, a derivative of pBI121 (Jefferson et al., 1987), harbors a 35S CaMV promoter-intron-GUS expression cassette. The complete synthetic promoters containing tetramers of the cis-sequences and the 35S minimal promoter were amplified by PCR from recombinant pBT10-GUS plasmids using primers with HindIII and BamHI sites, respectively (P1, 5′-TAGCAAGCTTGAATTCGGCGCGCCACTAGT-3′; P2, 5′-TTGGATCCGGTGGCCACTCGAGCGTG-3′; 55°C annealing temperature). The CaMV 35S promoter in front of the intron-containing GUS gene in pBINGUSintron was removed by HindIII/BamHI digestion and replaced with HindIII/BamHI-digested PCR products. Using the same strategy, the minimal promoter from empty pBT10-GUS was introduced into pBINGUSintron to give a negative control for infiltration experiments. Constructs were confirmed by sequencing and transformed into an Agrobacterium tumefaciens C58C1 strain for agroinfiltration experiments.
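As a quick plausibility check, the positions of the HindIII (AAGCTT) and BamHI (GGATCC) recognition sequences in the two primers can be located with a few lines of Python. The primer strings below assume that the hyphen inside the printed P1 sequence is a typesetting line break rather than part of the sequence. The function name is hypothetical.

```python
# Recognition sequences of the two enzymes used for cloning.
SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC"}

# Primer sequences as given in the text (5' to 3'), internal hyphen removed.
P1 = "TAGCAAGCTTGAATTCGGCGCGCCACTAGT"
P2 = "TTGGATCCGGTGGCCACTCGAGCGTG"

def find_sites(primer, sites=SITES):
    """Return {enzyme: 0-based position} for every site present in the primer."""
    return {name: primer.find(seq) for name, seq in sites.items() if seq in primer}
```

Under these assumptions, P1 carries the HindIII site and P2 the BamHI site, which matches the HindIII/BamHI cloning strategy described above.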
To isolate parsley protoplasts, a 5-d-old dark-grown parsley callus culture was used. The culture was spun down for 5 min at room temperature and 300g. The cell pellet was resuspended in 30 mL of enzyme solution (0.08% macerozyme, 0.5% cellulase in 0.24 M CaCl₂) and filled up to 90 mL with 0.24 M CaCl₂. Cells were shaken for 20 h at 20 rpm and 23°C and subsequently for 20 min at 40 to 45 rpm at the same temperature in the dark. The suspension was divided into two 50-mL tubes and spun down for 2 min at 300g. Cell pellets were washed with 20 mL of 0.24 M CaCl₂ (2 min, 300g) each, resuspended in 25 mL of P5 medium (Gamborg's B5 supplemented with 0.28 M Suc and 1 mg L⁻¹ 2,4-dichlorophenoxyacetic acid, pH 5.7) each, and combined. After centrifugation (5 min, 300g), the intact protoplasts float on the surface of the medium. These protoplasts were removed and filled up with P5 medium. This flotation procedure was repeated three times. The protoplasts obtained were used for transformation.
For transformation, 10 μg of plasmid DNA was mixed with 200 μL of polyethylene glycol solution [25% polyethylene glycol 6000, 450 mM mannitol, and 100 mM Ca(NO₃)₂] in a 15-mL tube. A total of 200 μL of protoplasts was added and mixed gently. After incubation for 20 min in the dark, 5 mL of 275 mM Ca(NO₃)₂ supplemented with 2 mM MES (pH 6.0) was used to stop the transformation. Protoplasts were pelleted by centrifugation at 150g for 7 min, and the cell pellet was resuspended in 6 mL of P5. Half of the suspension was transferred to a new 15-mL tube. Pep25 was added to one of the tubes (final concentration, 300 ng mL⁻¹). Pep25 elicitor (peptide, 5′-DVTAGAEVWNQPVRGFKVYEQTEMT-3′) from Phytophthora sojae (Nürnberger et al., 1994) was synthesized by SeqLab. Protoplasts were cultured in the dark at 23°C for 24 h. After this cultivation time, protoplasts were harvested by the addition of 9 mL of 0.24 M CaCl₂ and centrifugation for 10 min at 1,400g. Cell pellets were frozen in liquid nitrogen and stored at −80°C until GUS/LUC analysis.
A total of 150 μL of LUC extraction buffer (0.1 M NaH₂PO₄, pH 7.8, supplemented with 1 mM dithiothreitol) was added to the frozen samples. After shaking (mixer 5432; Eppendorf) for 20 min at 4°C, extracts were centrifuged for 10 min at 4°C and 25,000g. The supernatant was kept on ice and was used for protein quantification and GUS/LUC assays. Total protein was determined according to Bradford (1976), and a defined amount of 4 μg of protein in 50 μL of LUC extraction buffer was used for LUC assays. LUC assays were prepared according to de Wet et al. (1987). Diluted samples were put into the wells of white 96-well microtiter plates, and the plates were inserted into a TriStar LB 941 microplate reader (Berthold Technologies). A total of 50 μL of luciferin (0.2 mM in glycylglycine, pH 7.8) and 175 μL of LUC reaction buffer (15 mM MgSO₄, 25 mM glycylglycine, pH 7.8, and 5 mM ATP) were added by the TriStar, and the luminescence was measured for 15 s. For GUS analysis (Jefferson et al., 1987), 25 μL of the diluted protein extract (2 μg) was transferred into a well of a black 96-well microtiter plate. A total of 225 μL of GUS reaction buffer (50 mM NaPO₄, pH 7.0, 10 mM Na₂EDTA, 0.1% Triton X-100, 0.1% N-lauryl sarcosine, 10 mM β-mercaptoethanol, and 1 mM 4-methylumbelliferyl-β-D-glucuronide) was added, and the plate was inserted into the TriStar and incubated at 37°C for 10 min prior to measurements at 37°C. For continuous measurement of GUS activity, the samples in each well were then measured every 15 min for 1 s over the next 3 h (excitation, 360 nm; emission, 460 nm). Afterward, for each well, a linear regression over the time period with a linear increase of fluorescence was performed. Nonlinear parts were excluded from the regression. The slope of the regression line was then transformed into pmol 4-MU min⁻¹ mg⁻¹ protein. For this, a calibration of fluorescence units with defined amounts of 4-MU was performed in the TriStar. A linear increase of fluorescence units with 4-MU concentration has been observed up to at least 75 mM. This linear correlation was then used for the transformation mentioned above.
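The conversion from a fluorescence time course to a specific GUS activity can be sketched in Python as follows. This is a minimal illustration, not the evaluation routine actually used: it assumes that the caller passes only the time points from the linear phase, that the calibration is summarized as a single factor of fluorescence units per pmol 4-MU, and that the protein amount is given in mg. The function and parameter names are hypothetical.

```python
def slope(times, values):
    """Ordinary least-squares slope of `values` against `times`."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den

def gus_activity(times_min, fluorescence, fu_per_pmol, protein_mg):
    """GUS activity in pmol 4-MU per min per mg protein.

    `times_min` and `fluorescence` should cover only the linear phase;
    `fu_per_pmol` is the calibration slope (fluorescence units per pmol 4-MU).
    """
    fu_per_min = slope(times_min, fluorescence)
    return fu_per_min / fu_per_pmol / protein_mg
```

For instance, with hypothetical readings of 100, 250, 400, and 550 fluorescence units at 0, 15, 30, and 45 min, the slope is 10 units per minute; dividing by the calibration factor and the protein amount then gives the specific activity.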
For each synthetic promoter, at least three independent experiments with two transformations each were carried out. To obtain comparable results that are independent of transformation efficiencies, all GUS values were normalized with the help of their corresponding LUC values. Only LUC values obtained without elicitor were used for normalization, because the same transformation gives lower LUC values after elicitor treatment, although the transformation efficiency is the same. For normalization of the GUS values, one LUC value (without Pep25 elicitor) was selected and all other LUC values without elicitor were divided by this selected LUC value. The obtained quotients were used to divide the corresponding GUS values with and without elicitor. SD values were calculated from these normalized GUS values. GUS values and SD values from controls (TATA and 4D) were calculated from all performed experiments.
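The normalization described above can be expressed as a short Python sketch. The data layout (one dictionary per transformation, holding the LUC value without elicitor and the GUS values without and with elicitor) and the function name are hypothetical, and the choice of the first transformation as the reference is arbitrary.

```python
def normalize_gus(samples, reference_index=0):
    """Normalize GUS values by the LUC value measured without elicitor.

    Each sample is a dict with keys "luc_minus", "gus_minus", and "gus_plus".
    The LUC value of the reference sample defines a quotient of 1.
    """
    reference_luc = samples[reference_index]["luc_minus"]
    normalized = []
    for sample in samples:
        # Quotient > 1 means this transformation was more efficient than the reference.
        quotient = sample["luc_minus"] / reference_luc
        normalized.append({
            "gus_minus": sample["gus_minus"] / quotient,
            "gus_plus": sample["gus_plus"] / quotient,
        })
    return normalized
```

A transformation with twice the reference LUC value thus has both of its GUS values halved, so that differences in transformation efficiency cancel out.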

Agroinfiltration of N. benthamiana Leaves
The inoculum for infiltration experiments was prepared as described (Wroblewski et al., 2005) with minor changes. Agrobacteria harboring the promoter-GUS fusions in pBINGUSintron (see above) were grown on Luria-Bertani (LB) plates with rifampicin (50 μg mL⁻¹), kanamycin (50 μg mL⁻¹), and carbenicillin (100 μg mL⁻¹) at 25°C for 48 h. A single colony was used to inoculate 5 mL of LB broth with the same antibiotics as above. The starter culture was grown for 24 h at 25°C, added to 45 mL of fresh LB broth (with antibiotics; see above), and cultivated for a further 5 to 6 h. Cells were harvested by centrifugation at 3,000g for 10 min and resuspended in deionized water to a final optical density at 600 nm of 0.4 to 0.5.
The N. benthamiana plants used for the infiltration experiments were either soil grown in the greenhouse (during summer) or in a 25°C light chamber (16 h of light/8 h of dark, during winter). The bacterial suspension was infiltrated into the abaxial side of fully expanded N. benthamiana leaves using a needleless 1-mL syringe. For each experiment, the positive control (pBINGUSintron with 35S-GUS), the negative control (TATA-GUS in pBINGUSintron), and the construct under investigation were infiltrated in different areas of the same leaf. A second leaf from a different plant was infiltrated in the same way as a technical replicate. After infiltration, the plants were kept in the greenhouse for 3 d.

Figure 3. The pBT10GUS-d35SLUC vector for transient reporter gene assays. A, Schematic representation of pBT10GUS-d35SLUC with relevant restriction sites and positions of the GUS (uidA) and LUC reporter genes. Also shown are the double CaMV 35S promoter (d35S), CaMV 35S minimal promoter (TATA), nopaline synthase terminator (Tnos), origin of replication (ColE1), and ampicillin resistance gene (bla). B, Quantitative GUS reporter gene assays using pBT10GUS-d35SLUC alone and the same vector containing synthetic promoter 4D driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (−) and with (+) Pep25 elicitor. All normalized GUS values are shown in Table II. C, Schematic representation of four copies of cis-sequences cloned upstream of the 35S minimal promoter (TATA) and the GUS (uidA) reporter gene in pBT10GUS-d35SLUC. [See online article for color version of this figure.]

the normalized GUS values (pmol 4-methylumbelliferone [4-MU] min⁻¹ mg⁻¹ protein) and SD obtained for all reporter gene constructs in the parsley protoplast system without (−) and with (+) Pep25 treatment. The 25 Pep25-responsive sequences are from 21 different motifs, belong to 10 different motif groups, and are derived from 22 different Arabidopsis genes.

Elicitor-Responsive cis-Sequences from the W-Box-Related Motif Group 27 Define a Novel cis-Regulatory Sequence
Twenty-six of the 407 identified motifs belong to motif group 27, which contains WRKY binding site similarity. The W-box similarity of group 27 originates from the conserved W-box core sequence TTGAC(C/T) in the FBP shown in Figure 4C (Rushton et al., 2010). However, not all sequences that generate this motif family have the conserved W-box core sequence. To test the

Figure 4. Elicitor-responsive sequences from motif group 27. A, Quantitative GUS reporter gene assays using pBT10GUS-d35SLUC (NC) and the same plasmid containing synthetic promoters 4D (PC) or four copies of single sequences 18 through 24 (Table I) driving GUS expression. Plasmids were transformed into parsley protoplasts and treated without (−) and with (+) Pep25 elicitor. B, Alignment of elicitor-responsive sequences 18 through 24. Underlined are W-box core sequences. A core sequence determined from the alignment is shaded. C, FBP generated by STAMP with all 26 motifs in motif group 27 (Fig. 2). [See online article for color version of this figure.]

Figure 8. Transient reporter gene activity of elicitor-responsive sequences on T-DNA constructs infiltrated with A. tumefaciens in N. benthamiana. PC and NC are reporter gene constructs with a 35S CaMV and a minimal promoter, respectively. The numbers correspond to the sequence numbers shown in Tables I and II.

Table II. Normalized GUS values (pmol 4-MU min⁻¹ mg⁻¹ protein) and SD obtained for reporter gene constructs in a parsley protoplast system without (−) and with (+) Pep25 treatment