Recurrent Deletions of Puroindoline Genes at the Grain Hardness Locus in Four Independent Lineages of Polyploid Wheat

Polyploidy is known to induce numerous genetic and epigenetic changes, but little is known about their physiological bases. In wheat, grain texture is mainly determined by the hardness (Ha) locus, which consists of the genes Pina and Pinb. These genes are conserved in diploid progenitors but were deleted from the A and B genomes of tetraploid Triticum turgidum (AB). We now report recurrent deletions of Pina-Pinb in other lineages of polyploid wheat. We analyzed the Ha haplotype structure in 90 diploid and 300 polyploid accessions of Triticum and Aegilops species. Pin genes were conserved in all diploid species, and deletion haplotypes were detected in all polyploid Triticum and most of the polyploid Aegilops species. Two Pina-Pinb deletion haplotypes were found in hexaploid T. aestivum (ABD). Pina and Pinb were eliminated from the G genome but maintained in the A genome of tetraploid T. timopheevii (AG). Subsequently, Pina and Pinb were deleted from the A genome but retained in the A m genome of hexaploid T. zhukovskyi (AA m G). Comparison of deletion breakpoints demonstrated that the Pina-Pinb deletion occurred independently and recurrently in the four polyploid wheat species. The implications of Pina-Pinb deletions for polyploidy-driven evolution of genes and genomes, and their possible physiological significance, are discussed.


INTRODUCTION
We screened a BAC library of Ae. speltoides (Akhunov et al. 2005) with Gsp and Pina probes, each of which identified three BACs. The Gsp-containing BACs did not overlap with the Pina-containing BACs. We estimate that the Ha-S genomic region in Ae. speltoides is three times the size of Ha-A m in T. monococcum and of Ha-D in Ae. tauschii, and five times the size of Ha-D in T. aestivum. A Pina BAC, 197O23, was shotgun-sequenced at 8x coverage and assembled into 13 contigs after prefinishing, totaling 212,510 bp. Four non-transposable element (TE) protein-coding genes, Pina, Pinb and two ATPase genes, were found in this BAC; they are located in the contig at the 3' end, lie in the same orientation and span 28,848 bp (Fig. 2). Gene5, previously reported to be present in the collinear region between Pina and Pinb of T. monococcum and Ae. tauschii, was not found in Ae. speltoides. Based on sequence homology and collinearity, the two ATPase genes in BAC 197O23 correspond to ATPase-4 and ATPase-5 at the Ha-D locus and are orthologous to two truncated ATPase genes upstream of Gene8 at the Ha-B locus (Chantret et al. 2005).
The rest of BAC 197O23 is gene-free and mainly occupied by TEs and tandem repeats, a typical feature of large genomes, where genes are clustered into islands and separated by nested TEs (Wicker et al. 2001). Gene8, ATPase-1, ATPase-2 and ATPase-3 were located in a separate BAC.

Survey of the haplotype structure at the Ha locus in Aegilops and Triticum
We determined the haplotype structure at the Ha locus by Southern analysis of tester DNA digested with the restriction enzymes EcoRI, HindIII or BamHI, using Pina and Pinb gene probes. We estimated the copy number of the Pina and Pinb genes in each tester species by counting the number of fragments detected by Southern hybridization. The data were tabulated to determine whether the haplotype structure was conserved or whether there were null haplotypes for one or both of the Pin genes at the Ha locus (Table 1 and Supplementary Table 1s). Null haplotypes were further characterized according to the size of the deletion either by Southern analysis using additional gene probes that mark the Ha locus (see Fig. 1

Diploid species:
We randomly selected at least two accessions from each of the 12 diploid species of Aegilops and Triticum (a total of 90 accessions, Table 1 and Supplementary Table 1s) for the haplotype survey. In all cases, Southern hybridization detected a single band or, rarely, multiple bands for the Pina and Pinb gene probes, indicating that the haplotype structure at the Ha locus is conserved in the diploid species. A single copy each of Pina and Pinb was detected in the A- and D-genome donor species of polyploid wheat (Table 1). Five Aegilops species share the S genome, and all except Ae. speltoides are self-pollinated. All self-pollinated S-genome species had one copy each of Pina and Pinb.
Most accessions of Ae. speltoides also carry one copy of the Pin genes, and the multiple Southern hybridization fragments observed in some accessions (Supplementary Table 1s) may be due either to heterozygosity, because it is a cross-pollinated species, or, rarely, to the presence of intragenic restriction sites or gene duplication. All the C-, M-, U- and N-genome species also had one copy of the Pin genes, except for one accession of Ae. comosa where Southern analysis indicated multiple gene copies.
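
The band-counting logic used in this survey can be illustrated with a short Python sketch. It is not the authors' procedure; it assumes, for illustration only, that each intact gene copy yields one hybridizing fragment per digest and that the expected copy number equals the number of constituent genomes. The names `HaCall` and `call_ha_haplotype` are hypothetical.

```python
from typing import NamedTuple


class HaCall(NamedTuple):
    status: str        # "conserved", "null" or "ambiguous"
    pina_missing: int  # genomes without a detectable Pina copy
    pinb_missing: int  # genomes without a detectable Pinb copy


def call_ha_haplotype(n_genomes: int, pina_bands: int, pinb_bands: int) -> HaCall:
    """Compare observed Southern band counts with the expected copy number.

    Assumption (not from the paper): one band per gene copy, and one
    expected copy of each Pin gene per constituent genome.
    """
    if n_genomes < 1 or pina_bands < 0 or pinb_bands < 0:
        raise ValueError("genome count must be >= 1 and band counts >= 0")

    # More bands than genomes cannot be explained by copy number alone;
    # the text lists heterozygosity, intragenic restriction sites or
    # gene duplication as possible causes.
    if pina_bands > n_genomes or pinb_bands > n_genomes:
        return HaCall("ambiguous", 0, 0)

    pina_missing = n_genomes - pina_bands
    pinb_missing = n_genomes - pinb_bands
    status = "conserved" if pina_missing == 0 and pinb_missing == 0 else "null"
    return HaCall(status, pina_missing, pinb_missing)


# A diploid with one band per probe is scored as conserved.
diploid_call = call_ha_haplotype(n_genomes=1, pina_bands=1, pinb_bands=1)
# A diploid accession with multiple Pina bands needs follow-up.
multiband_call = call_ha_haplotype(n_genomes=1, pina_bands=3, pinb_bands=1)
# A two-genome species with one band per probe lacks the genes in one genome.
two_genome_call = call_ha_haplotype(n_genomes=2, pina_bands=1, pinb_bands=1)
```

Returning "ambiguous" rather than forcing a copy-number call mirrors the caution in the text: extra fragments alone do not distinguish heterozygosity from duplication.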

Tetraploid species:
The above-mentioned diploid species have contributed genomes to tetraploid Triticum and Aegilops species, and two copies each of the Pina and Pinb genes are expected in the genomes of these tetraploid species (Table 1). The tetraploid wheat species T. turgidum (AB) and T. timopheevii (AG) form the A-genome cluster. We screened 92 accessions of T. turgidum, including eight subspecies representing the range of wild and domesticated forms. All showed the null haplotype for the puroindoline genes (Table 1 and Supplementary Table 1s). We screened 65 accessions of T. timopheevii, including two subspecies representing the range of wild and domesticated forms (Supplementary Table 1s). All carried only one copy of the Pina and Pinb genes, indicating a null haplotype at the Ha locus in one of its genomes (Supplementary Table 1s). Gene5, which lies between Pina and Pinb, was, as expected, present in one copy. The Gsp probe detected two copies, indicating that one of the breakpoints that produced the null haplotype is located between Gsp and Pina (Fig. 3).
chromosome 5D of Red Egyptian. Therefore, at least two independent deletion events occurred at the Ha-D locus in common wheat: one with the distal breakpoint between Gsp and Pina in Sea Island and Komar, and another with the distal breakpoint beyond BGGP in Red Egyptian.
We surveyed three accessions of T. zhukovskyi (AA m G), the second A-genome cluster hexaploid species. It is expected to have two copies of the Pin genes, one from T. timopheevii and the second from T. monococcum. However, only one copy of the Pin genes was detected, indicating the presence of a second null haplotype at the Ha locus in one of its genomes (Fig. 4). Because T. zhukovskyi is autoallohexaploid, the loss of an Ha locus could be due either to recombination between the A and A m genomes or to a deletion event. This question can be resolved based on the BGGP and Gsp hybridization patterns: 1) if each probe detects three bands, Pina and Pinb were deleted from one genome; 2) if two bands with similar intensity are observed, BGGP, Gsp, Pina and Pinb were all deleted from the A or A m genome; and 3) if the two bands differ significantly in intensity, A-A m recombination, rather than deletion, occurred. Our results support the second scenario (Fig. 4), i.e., BGGP, Gsp, Pina and Pinb were deleted from the A or A m genome, similar to the Ha-D haplotype in Red Egyptian.
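
The three-way decision rule above can be written out as a small Python sketch. This is an illustrative encoding, not the authors' analysis: the function name `interpret_bands`, the `Scenario` labels and the reduction of "similar intensity" to a yes/no input are all assumptions, since the text does not specify how band intensity was compared.

```python
from enum import Enum
from typing import Optional


class Scenario(Enum):
    PIN_ONLY_DELETION = "Pina and Pinb deleted from one genome"
    LARGE_DELETION = "BGGP, Gsp, Pina and Pinb deleted from the A or A m genome"
    RECOMBINATION = "A-A m recombination rather than deletion"
    UNRESOLVED = "pattern not covered by the three scenarios"


def interpret_bands(
    bggp_bands: int,
    gsp_bands: int,
    similar_intensity: Optional[bool] = None,
) -> Scenario:
    """Map BGGP and Gsp Southern patterns in T. zhukovskyi to a scenario.

    similar_intensity is a hypothetical boolean summary of whether the
    two bands hybridize with comparable strength; it is only consulted
    when both probes detect two bands.
    """
    # Scenario 1: both flanking probes still detect all three genomes.
    if bggp_bands == 3 and gsp_bands == 3:
        return Scenario.PIN_ONLY_DELETION

    # Scenarios 2 and 3: one genome's worth of flanking signal is gone;
    # relative band intensity separates deletion from recombination.
    if bggp_bands == 2 and gsp_bands == 2 and similar_intensity is not None:
        if similar_intensity:
            return Scenario.LARGE_DELETION
        return Scenario.RECOMBINATION

    return Scenario.UNRESOLVED


# The pattern reported for T. zhukovskyi: two bands of similar intensity.
observed = interpret_bands(bggp_bands=2, gsp_bands=2, similar_intensity=True)
```

Patterns outside the three listed cases are returned as unresolved rather than being forced into one of the scenarios.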
We surveyed three D-genome cluster and one U-genome cluster hexaploid Aegilops species, and all are expected to have two to three copies of Pina and three copies of Pinb, depending upon the genotype of the tetraploid parent (see above). Ae. crassa (DDX) and

Sequence analyses of unique haplotypes
A sequence analysis of deletion haplotypes detected in polyploid wheat species was used to further characterize them and to allocate their genomic origin. These results are summarized in Fig. 1s and 2s). In the 3'UTR of Pinb, an 88-bp fragment, spanning the stop codon and polyadenylation signal, was found in triplicate in Pinb-A of T. timopheevii compared to its ancestor T. urartu. The repeat members are identical except for a single nucleotide polymorphism (SNP) (Supplementary Fig. 2s). A PCR assay showed that the 88-bp triple repeat is fixed at the species level (Fig. 5). Pinb haplotypes were conserved in both amphiploids (Fig. 6). This 21-bp deletion was evidently caused by unequal crossing over between the 11-bp direct repeats (Supplementary Fig. 10s) and led to a loss of seven amino acids (WYNEVGG) in the PINB protein. Because the A m A m S sh S sh amphiploid used is from the S2 generation, the unequal crossover either occurred during female meiosis of the Ae. sharonensis parent TMB02 or occurred after polyploidization and was rapidly fixed.
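
The arithmetic behind the 21-bp deletion can be checked with a toy Python model of the deletion product of unequal crossing over between direct repeats. The sequence below is entirely hypothetical and is not the Pinb sequence; the 10-bp spacer is an assumption chosen so that one 11-bp repeat plus the spacer gives the reported 21 bp. Only the deletion product is modelled, not the reciprocal duplication.

```python
def unequal_crossover_deletion(seq: str, repeat: str) -> tuple[str, str]:
    """Return (deletion_product, deleted_segment) for two direct repeats.

    Recombination between the two repeat copies leaves a single copy and
    removes everything from the start of the first copy up to the start
    of the second.
    """
    first = seq.find(repeat)
    second = seq.find(repeat, first + 1) if first != -1 else -1
    if second == -1:
        raise ValueError("sequence must contain two copies of the repeat")
    deleted = seq[first:second]
    product = seq[:first] + seq[second:]
    return product, deleted


# Hypothetical building blocks (not taken from the paper).
LEFT_FLANK = "ATGAAGACC"
REPEAT = "TGGTACAATGA"   # 11-bp direct repeat
SPACER = "AGTTGGAGGT"    # assumed 10-bp spacer between the repeat copies
RIGHT_FLANK = "GGCTGGTGA"

parent = LEFT_FLANK + REPEAT + SPACER + REPEAT + RIGHT_FLANK
product, deleted = unequal_crossover_deletion(parent, REPEAT)

deleted_bp = len(deleted)          # one repeat copy plus the spacer
in_frame = deleted_bp % 3 == 0     # a multiple of 3 keeps the reading frame
codons_lost = deleted_bp // 3      # whole codons removed
```

With these toy inputs, the deleted segment is a whole number of codons, which is consistent with an in-frame loss of seven residues as reported for PINB.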

DISCUSSION
The most remarkable observation on the structure and evolution of the Ha locus in wheat and the Triticeae is the absolute conservation of the locus in diploid species, reported here and in previous papers (Gautier et al. 2000, Lillemo et al. 2002, Massa et al. 2004, Chen et al. 2005, Simeone et al. 2006), together with recurrent and independent deletions in the polyploid Triticum and Aegilops species. To date, more than 200 accessions from the two diploid Triticum and ten diploid Aegilops species have been analyzed, and not a single case of deletion polymorphism at the Ha locus has been reported. In particular, no deletion polymorphisms have been detected in a diverse sample of more than 130 accessions of the A-, B- and D-genome donor species of polyploid wheat. This is in contrast to the frequent deletion haplotype polymorphisms for a defense-gene cluster in the D-genome diploid Ae. tauschii (Brooks et al. 2006). Set against the high rate of deletion polymorphism in polyploid species, not a single case of insertion-deletion polymorphism was documented in a sample of Pina and Pinb sequences from 50 accessions of diploid Ae. tauschii representing its geographical diversity (Massa et al. 2004). All polyploid wheats and most polyploid Aegilops species harbored deletion haplotypes of independent origin at the Ha locus. How, then, does a gene that is essential in a diploid suddenly become so deleterious in a polyploid that it must be deleted? To begin to answer this question, we need to discuss the nature of the puroindoline genes, their function, the nature of gene action in polyploids, and the mechanisms of polyploid genome evolution and speciation that promote the expression and evolution of novel traits.
Amino acid sequence analysis has shown that numerous storage proteins, including low-molecular-weight glutenin, alpha-/beta gliadins, lipid transfer proteins, chymotrypsin inhibitor WCI, alpha-amylase/trypsin inhibitor, GSP, PINA and PINB, belong to the alpha-amylase inhibitors (AAIs) and seed storage (SS) protein subfamily, because they have an AAI-SS domain. AAIs play an important role in the natural defense of plants against insects and pathogens mainly by inhibiting alpha-amylases and proteases.
Puroindolines have bactericidal (Jing 2003) and fungicidal (Krishnamurthy et al. 2001) activities. PINA and PINB proteins bind directly to the surface of starch granules in the endosperm cells and form a friabilin complex. Using isogenic lines, Swan et al. (2006) showed that puroindolines seem to protect starch from microbial digestion, and that increased expression of PIN proteins decreased the starch digestibility of wheat in the rumen by up to 30%. Alpha-amylase is an important enzyme in starch metabolism and is induced in the aleurone by gibberellic acid from the embryo during germination.
Conceivably, wheat starch can be protected from alpha-amylase digestion by AAI activity of the PIN proteins. Therefore, because of the important role of PIN proteins in plant defense and seed physiology, Pina and Pinb genes may be under strong selection pressure and are maintained in all the diploid species.
One of the consequences of polyploidy is the doubling or tripling of gene copy number; thus, the amount of protein may be doubled or tripled for some of these genes. This dosage response has been demonstrated for the Pin genes, and a super-soft hexaploid wheat genotype has been created (See et al. 2004). Because wheat starch is protected from alpha-amylase digestion by the AAI activity of the PIN proteins, we hypothesize that the sudden dosage-driven increase in expression levels of puroindoline genes in polyploids would impede the embryos from obtaining nutrition from the endosperm. The situation may be more severe when polyploid plants are under abiotic stress, such as heat and drought during grain filling, which adversely affects endosperm development. Point mutations in Pinb can liberate the PIN proteins from binding to the starch granule surface and cause significant differences in grain texture (Giroux and Morris 1997, Morris et al. 2001); however, the PIN proteins with AAI activity remain in the endosperm cells. Therefore, deletion of the Pina-Pinb genes provides the most efficient mechanism to reduce AAI activity and is the least detrimental, because they are structural genes. The