Table I. Core hexamers enriched in gene body sequences and intergenic sequences

The hexamers found at >2σ abundance in OsMADS1-bound chromatin sequences (as in Fig. 5) were queried for matches to the consensus binding sequences of 92 known transcription factor classes in PLACE (http://www.dna.affrc.go.jp/PLACE/). Column 3 lists the core hexamers, and column 2 lists the larger consensus binding sites of which they are a part (underlined nucleotides). The number of sequences that contain the larger consensus motif for the respective transcription factor is given in parentheses in column 2. The transcription factor class that binds the consensus sequence is given after it where known. Column 4 lists the number of sequences containing the core hexamer among all OsMADS1-bound sequences, and column 5 lists the number of OsMADS1-bound sequences containing both the core hexamer and the CArG element. Hexamers whose co-occurrence with the CArG motif is not statistically significant are indicated in boldface in column 5; these bound sites likely contain A-tracts.

| cis-Element | Consensus Binding Sequence and Transcription Factor Class | Core Hexamer Motif | No. of Sequences with Core Hexamer Motif | No. of Sequences with CArG Element and Core Hexamer Motif |
|---|---|---|---|---|
| Core hexamers found in gene body sequences compared with the shuffled sequences | | | | |
| GCrichrepeatII | CGCCGCGC (41) | CGCCGC | 274 (P < 0.0001) | 58 (P < 0.0001) |
| AGCBOXNPGLB | AGCCGCC (28) ERF-domain | GCCGCC | 243 (P < 0.0001) | 54 (P < 0.0001) |
| A-T rich binding motif for AP2R2 or GARE | TTTGTT/AACAAA (314) euAP2 with double Ap2 domain | TTTGTT | 314 (P < 0.0001) | 246 (P < 0.0001) |
| REGION1OSOSEM | CGGCGGCCTCGCCACG bZIP | CGGCGG/CTCGCC | 316 (P < 0.0001) | 63 (P = 0.6251) |
| GLUEBOX2/OSGT3 | CTTTTGTGTACCTA | TTTTGT | 283 (P = 0.0001) | 233 (P < 0.0001) |
| RGATAOS | CAGAAGATA (2) | AGAAGA | 192 (P < 0.0001) | 112 (P < 0.0001) |
| GCrichrepeatIV | GTCTCCCT (7) | CTCCCT | 152 (P < 0.0001) | 79 (P = 0.0002) |
| RYREPEATVFLEB4/RYREPEAT4 | CATGCATG (21) B3 domain | CATGCA | 150 (P < 0.0001) | 109 (P < 0.0001) |
| PROLAMINBOXOSGLUB1 | TGCAAAG (37) | TGCAAA | 175 (P < 0.0001) | 133 (P < 0.0001) |
| TATAboxIV | TATATACA (23) | TATATA | 178 (P = 0.3517) | 132 (P = 0.0269) |
| GCAAmotif | SCAAAATGA (11) | AAAATG | 209 (P < 0.0001) | 170 (P < 0.0001) |
| Core hexamers found in intergenic sequences compared with the shuffled sequences | | | | |
| TATAboxII | TATTTAAA (14) | TTTAAA | 223 (P = 0.004) | 184 (P = 0.5465) |
| GCrichrepeatII | CGCCGCGC (8) | CGCCGC | 61 (P = 0.0002) | 26 (P = 0.8415) |
| A-T rich binding motif for AP2R2 or GARE | TTTGTT/AACAAA (183) euAP2 with double Ap2 domain | AACAAA | 183 (P < 0.0001) | 145 (P < 0.0001) |
| REGION1OSOSEM | CGGCGGCCTCGCCACG bZIP | GGCGGC | 62 (P < 0.0001) | 29 (P = 0.2109) |
| RYREPEATVFLEB4/RYREPEAT4 | CATGCATG (22) B3 domain | CATGCA | 106 (P < 0.0001) | 68 (P < 0.0001) |
| GLUEBOX2/OSGT3 | CTTTTGTGTACCTTA | TTTTGT | 182 (P < 0.0001) | 149 (P < 0.0043) |
| DirectRepeat | GGTTTTTAAGTT | GTTTTT | 173 (P < 0.0001) | 138 (P = 0.0001) |
| GCAAmotif | SCAAAATGA (1) | CAAAAT | 172 (P < 0.0001) | 147 (P < 0.0001) |
| PROLAMINBOX | CACATGTGTAAAGGT | ACATGT | 119 (P < 0.0001) | 96 (P < 0.0001) |
| MYCATERD1 | CATGTG (109) Myc domain | CATGTG | 109 (P < 0.0001) | 76 (P < 0.0001) |
| GLUTEBP1OS | AAGCAACACACAAC | ACACAC | 77 (P < 0.0001) | 55 (P = 0.0003) |
| Element II Os region Element II OsPCNA-2 | TGGGCCCGT Class II TCP domain | TGGGCC | 74 (P < 0.0001) | 48 (P < 0.0001) |
| PYRIMIDINEBOXOSRAMY1A | CCTTTT (121) | CCTTTT | 121 (P < 0.0001) | 99 (P < 0.0001) |
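
The counts in columns 4 and 5 can be illustrated with a minimal sketch. It is not the pipeline used to produce Table I, and it rests on two assumptions that are not stated in the caption: the CArG element is taken to be the canonical CC(A/T)6GG box, and a hexamer is counted on either strand (suggested by the paired entry TTTGTT/AACAAA). The function names and the toy sequences are hypothetical.

```python
import re

# ASSUMPTION: canonical CArG box CC(A/T)6GG. The exact CArG definition used
# for Table I is not given in the caption. This pattern is its own reverse
# complement, so scanning one strand is sufficient.
CARG = re.compile(r"CC[AT]{6}GG")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_hexamer(seq: str, hexamer: str) -> bool:
    """True if the hexamer occurs on either strand (ASSUMPTION: both strands count)."""
    seq = seq.upper()
    hexamer = hexamer.upper()
    return hexamer in seq or reverse_complement(hexamer) in seq


def count_hexamer_and_carg(sequences, hexamer):
    """Return (n sequences with the hexamer, n sequences with hexamer and CArG)."""
    with_hexamer = 0
    with_both = 0
    for seq in sequences:
        if contains_hexamer(seq, hexamer):
            with_hexamer += 1
            if CARG.search(seq.upper()):
                with_both += 1
    return with_hexamer, with_both


# Hypothetical toy sequences, not data from the study.
toy_sequences = [
    "AACGCCGCTTCCATATATGGAA",  # CGCCGC and a CArG box
    "TTGCGGCGAAATTT",          # reverse complement of CGCCGC, no CArG box
    "CCAAAAAAGGTTTT",          # CArG box only
    "ATATATATAT",              # neither
]

if __name__ == "__main__":
    print(count_hexamer_and_carg(toy_sequences, "CGCCGC"))
```

The first value returned corresponds to a column 4 count and the second to a column 5 count. The P values in the table come from comparison with shuffled sequences, which this sketch does not attempt.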