The Rice Miniature Inverted Repeat Transposable Element mPing Is an Effective Insertional Mutagen in Soybean


Soybean (Glycine max) is a key component of modern agriculture due to the high protein and oil content of its seed and the lower fertilizer inputs required because of its nitrogen-fixing capacity (Singh and Shivakumar, 2010). The desire to understand the underlying genetics of these traits prompted the recent sequencing of the soybean genome (Schmutz et al., 2010). Sequence annotation predicted at least 46,430 genes (Schmutz et al., 2010), with another study identifying up to 55,616 (Libault et al., 2010). While homology to characterized proteins can be used to predict the function of a few genes, the vast majority are uncharacterized. Determining gene function using gene silencing or overexpression strategies is feasible; however, the lack of high-throughput transformation in soybean limits the scope of these approaches. Therefore, alternative tools for identifying soybean genes are needed.
Insertional mutagenesis using T-DNA can be a powerful tool to connect genotype to phenotype when coupled with high-throughput transformation, as was done with Arabidopsis (Arabidopsis thaliana). The absence of a comparable transformation system for soybean makes T-DNA tagging impractical. However, insertional mutagenesis by an active transposable element would facilitate soybean gene discovery. The only transposon-tagging tool that has been characterized for soybean is the Ac/Ds system (Mathieu et al., 2009). The Ds element primarily produces linked insertions in maize (Zea mays), tobacco (Nicotiana tabacum), and Arabidopsis (Dooner and Belachew, 1989; Jones et al., 1990; Bancroft and Dean, 1993). For Ds to be an effective mutagen in soybean, many Ds elements would have to be inserted throughout the genome by transformation. While this was possible in Arabidopsis, where transformation is very efficient (Muskett et al., 2003; Nishal et al., 2005), it is not a viable strategy in soybean. In another legume species, Medicago truncatula, the Tnt1 retroelement is an effective mutagen (d'Erfurth et al., 2003). However, activation of Tnt1 usually requires tissue culture (d'Erfurth et al., 2003), which limits throughput and has the potential to induce other genomic or epigenetic changes (Kaeppler and Phillips, 1993).
A transposon that exhibits many of the desired traits for transposon tagging is the mPing element from rice (Oryza sativa; Jiang et al., 2003; Kikuchi et al., 2003; Nakazaki et al., 2003). mPing is a 430-bp miniature inverted repeat transposable element that transposes at a high frequency and has reached high copy number in some rice cultivars (Naito et al., 2006). It is a nonautonomous deletion derivative of the Ping element and lacks the two open reading frames (ORF1 and Transposase [TPase]) required for transposition. In Arabidopsis, mPing was mobilized by expressing the ORF1 and TPase proteins from either Ping or the closely related Pong element (Yang et al., 2007). Most of the resulting insertions were located near genes (68.6% < 1 kb from a gene) and were shown to be unlinked to the original transgene. Unlike most DNA transposons, which often leave indels (called footprints) at the site of excision, mPing excision sites are repaired precisely at a high frequency in both yeast (99%; Saccharomyces cerevisiae; Hancock et al., 2010) and Arabidopsis (82%; Yang et al., 2007). These transposition characteristics indicated that mPing could be suitable for transposon tagging, with the caveat that heritable insertions had not been shown in hosts other than rice.
The objective of this work was to evaluate the suitability of mPing as a mutagenesis tool in soybean. The analysis included identifying heritable mutations and their frequency, characterizing the insertion site preference, and determining the extent to which precise excision of mPing occurs.

RESULTS

mPing Excision and Transformation
The pICDS-mP plasmid developed by Yang et al. (2007) was adapted for soybean transformation by changing the selectable marker from kanamycin to hygromycin resistance. The resulting plasmid, named pPing (Supplemental Fig. S1), contains an mPing gfp reporter construct (Fig. 1A), which only expresses gfp upon excision of the mPing element (Yang et al., 2007). The Ping proteins are expressed from a Ping cDNA containing the ORF1 and TPase coding regions (Yang et al., 2007). After transformation of soybean embryogenic tissue, hygromycin-resistant clusters were selected and assigned event numbers. PCR analysis was used to identify 10 events that were PCR positive for ORF1, TPase, and mPing.
PCR based on primers that flank the mPing element was used to detect transposition, as a smaller amplicon is produced after element excision (Fig. 2A). Two stages of somatic embryo development (globular and cotyledonary [Fig. 1B]) were collected during plant regeneration and tested for mPing excision. Figure 2B shows that at the globular stage, seven out of 10 lines have a single 778-bp band, indicating that mPing is still in its original position. However, three transgenic events (2-9, 3-3, and 2-24) produced additional 345-bp bands that reflect mPing transposition from the pPing construct. At the later cotyledonary stage, eight of the 10 lines have the 345-bp band, indicating an increased capacity for mPing excision later in embryo development (Fig. 2B). In addition, cotyledonary-stage somatic embryos from two transgenic events show no PCR products with these primers (Fig. 2B, events 2-9, 2-24). This could be due to, among other possibilities, the loss of one or both primer binding sites following mPing excision.
At least three plants were regenerated from tissue culture for each event and assigned a letter (e.g. event 3-3 plant A). Most regenerated plants tested for mPing excision by PCR had a single 778-bp product (Fig. 2C). However, two out of three plants produced from event 2-9 lacked the 778-bp band and instead had a smaller band (295 or 399 bp). The absence of the larger band indicates that mPing excised from the reporter in all of the embryogenic cells from which the plant was derived. Five other plants (two from event 2-11 and three from event 2-10) produced a PCR product with terminal mPing primers but no product with mPing flanking primers (Fig. 2C). Follow-up PCR using primers to gfp shows that event 2-10 initially had this region, but it was lost during generation of plants (Fig. 2D). These data suggest that a mutation occurred in the mPing reporter, probably due to excision events that removed flanking sequences. These results also show that in all cases where mPing excised from the reporter, it was not lost from the genome but instead had inserted in a new location.
Sequencing the lower bands resulting from PCR with primers flanking the mPing element allowed for analysis of excision sites. Precise excision was detected in both repetitive globular-stage and cotyledonary-stage somatic embryos (Supplemental Fig. S2). However, 30% (7/23) of the excision events resulted in small deletions (7-32 bp). The excision sites analyzed in plants 2-9 B and 2-9 C showed a deletion and an insertion, respectively (Supplemental Fig. S2). The prevalence of deletions at the excision site, leading to the loss of sequences flanking the mPing excision site, is consistent with the lack of PCR products for some events in Figure 2, B and C.
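The band-size reasoning used in this excision assay can be sketched in a few lines of Python. This is only an illustration: the helper name, the size tolerance, and the reading of the 3 bp beyond the 430-bp element as one copy of a target site duplication are assumptions, not details given in the text.

```python
# Band sizes reported in the text for PCR with primers flanking mPing.
UNEXCISED_BP = 778   # mPing still in the reporter
PRECISE_BP = 345     # mPing excised precisely
MPING_BP = 430       # element length given in the introduction


def classify_amplicon(size_bp, tolerance_bp=2):
    """Interpret a flanking-primer amplicon by its size (hypothetical helper)."""
    if abs(size_bp - UNEXCISED_BP) <= tolerance_bp:
        return "unexcised"
    if abs(size_bp - PRECISE_BP) <= tolerance_bp:
        return "precise excision"
    if size_bp < PRECISE_BP:
        return f"excision with {PRECISE_BP - size_bp}-bp deletion"
    if size_bp < UNEXCISED_BP:
        return f"excision with {size_bp - PRECISE_BP}-bp insertion"
    return "unexpected product"


# 778 - 345 = 433 bp: the 430-bp element plus 3 bp, assumed here to be one
# copy of a target site duplication (an assumption, not stated in the text).
extra_bp = UNEXCISED_BP - PRECISE_BP - MPING_BP

for band in (778, 345, 295, 399):
    print(band, classify_amplicon(band))
```

Under these assumptions, the 295-bp and 399-bp products from the event 2-9 plants fall into the deletion and insertion classes, which agrees with the sequencing results described for plants 2-9 B and 2-9 C.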

GFP Expression
The pPing construct is designed to express gfp upon excision of mPing. Screening the transgenic soybean lines at both the globular and cotyledonary stages for GFP fluorescence identified a single event (3-3) with detectable levels of gfp expression (Fig. 3). The lack of detectable fluorescence in the other lines shown to have mPing excision could result from a number of factors, including transposition in a limited number of cells, disruption of the construct during biolistic transformation, and disruption of the construct after mPing excision. PCR testing for the entire mPing reporter at the globular stage indicated that only four of the events had the complete reporter, including the promoter and terminator (Supplemental Fig. S3). Of these four, event 3-3 was the only one to show GFP fluorescence (Fig. 3). Consistent with the PCR results, event 3-3 showed fluorescence in both globular- and cotyledonary-stage embryos. In fact, when an event 3-3-derived plant was tested for gfp expression, one out of five meristems showed fluorescence, indicating that transposition occurred in one branch of this plant during its growth (Fig. 3, gfp-expressing leaf shown).
(Jiang et al., 2003). Plants produced from nine independently derived pPing events were tested for mPing transposition (Fig. 4A, data for three events shown). Eight of the nine events showed few if any strong unique bands that would indicate transposition. For example, tissue obtained from globular-stage embryos and plants derived from events 3-13 and 3-3 shows one to three strong bands per sample. In contrast, event 2-9 shows a continuum of bands at the globular stage and a unique pattern for all three plants tested (Fig. 4A). Because these plants were derived from a single cell line, this pattern indicates that mPing was active during embryo development or subsequent growth of the plants.
Figure 4A (right side) also shows the transposon display results for the 2-9 B T1 progeny, with two separate leaves tested from each plant (first and third trifoliolate leaves in adjacent lanes) to allow for differentiation between localized and widespread insertions. Eleven of the darker bands present in the 2-9 B parent were inherited by the progeny in Mendelian fashion (numbered arrows). In addition, 2-9 B T1 plants 1 and 8 show strong additional bands (black arrowheads) that were not observed in the 2-9 B parent. These unique insertions are present in the two leaves tested from each plant, indicating that the mPing insertions occurred before the initiation of both leaves. These insertions must have occurred in the parental tissues that gave rise to the gametes, because these plants did not inherit both ORF1 and TPase. In contrast, progeny numbers 2, 3, 5, and 7 have both ORF1 and TPase and show a large number of relatively weak bands that in most cases are specific to just one leaf (Fig. 4A, boxed region highlights an example). The characteristics of these bands are consistent with localized sectors derived from somatic mPing transposition in the leaves. A similar transposon display was performed for the progeny of plant 3-3 A, showing that a single heritable insertion was produced by this plant (Supplemental Fig. S4).
Cloning and sequencing 79 transposon display bands verified that most are the expected mPing terminal and adjacent soybean sequence (Supplemental Table S1). Only one was found to result from mispriming to the soybean genome, and two were composed of concatemers of primers. Six of the nine soybean events did not have a transposon display band resulting from pPing because there are no MseI sites close enough to the mPing element in the construct. However, because the pPing vector fragmented into several sections during transformation, three events showed strong ubiquitous bands that correspond to an untransposed mPing element adjacent to MseI sites in the soybean genome (see white square in Fig. 4A, event 3-3). These are easily distinguished from true transposition events by the presence of vector DNA flanking mPing. Sequencing of three faint background bands present in the negative control indicated that they result from primer concatenation during the amplification steps (i.e. fusion of the mPing-specific primer to the adapter primer or genomic fragment).
To provide additional evidence for the reliability of the transposon display, five mPing insertions identified in Figure 4A were verified by PCR with primers designed to flank the genomic location of the insertion (Fig. 4B; Supplemental Table S2). For example, the sequence of band 1 was used to determine the genomic location of the mPing insertion, allowing for a primer to be designed that is specific to a region beyond the sequence obtained by transposon display. As expected, PCR with this primer and an mPing-specific primer shows the same pattern as observed for transposon display (Fig. 4, A and B). Similarly, primers flanking both sides of the heritable mPing insertion site from plant 3-3 A produced a larger band (with mPing) and a smaller band (without mPing) that segregated in the 3-3 A progeny (Supplemental Fig. S4B), confirming the reliability of transposon display and indicating whether the insertion was present in one or both copies of the chromosome.
To determine if mPing activity continues in the next generation, transposon display was performed on 2-9 B T2 plants that contained both the Ping ORF1 and TPase genes (Fig. 5A). The progeny from each of the four parents show novel bands that are shared between multiple siblings, indicating mPing insertions that occurred in the T1 parent plants after our sampling of the initial leaves. In addition, there is a strong band present in one of the 2-9 B5 progeny, suggesting that an insertion occurred either relatively late in the T1 parent or early in the development of the T2 plant. A subset of the insertions novel to the T2 generation was tested by PCR and shown to be consistent with the transposon display pattern (Fig. 5B). Overall, the observed insertions show that mPing transposition activity continues during plant growth and produces at least one new germinal insertion per generation.
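The logic used to interpret the two-leaf transposon display can be expressed as simple set operations. The sketch below is illustrative only: the function name, the class labels, and the band sizes are made up, and real gels would also require a judgment about band intensity.

```python
def classify_bands(parent, leaf1, leaf2):
    """Sort the transposon display bands of one progeny plant into classes.

    Each argument is a set of band sizes (bp). Bands seen in both leaves and
    in the parent are inherited; bands seen in both leaves but not in the
    parent arose before both leaves formed; bands seen in only one leaf are
    consistent with somatic sectors.
    """
    in_both_leaves = leaf1 & leaf2
    return {
        "inherited": sorted(in_both_leaves & parent),
        "new_in_both_leaves": sorted(in_both_leaves - parent),
        "single_leaf_somatic": sorted((leaf1 ^ leaf2) - parent),
    }


# Hypothetical band sizes, for illustration only.
parent_bands = {120, 210, 305}
first_leaf = {120, 305, 410, 88}
third_leaf = {120, 305, 410, 150}

result = classify_bands(parent_bands, first_leaf, third_leaf)
print(result)
```

In this toy example the 410-bp band would correspond to the kind of strong, plant-specific band marked by black arrowheads, while the 88-bp and 150-bp bands would correspond to leaf-specific somatic insertions.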
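The way a flanking-primer PCR distinguishes one from two copies of an insertion can be written as a small decision rule. This is a hedged sketch: the function and its labels are hypothetical, and it ignores complications such as somatic excision in plants that still carry ORF1 and TPase.

```python
def call_insertion_genotype(has_large_band, has_small_band):
    """Call the genotype at one insertion site from a flanking-primer PCR.

    has_large_band: amplicon containing mPing was detected.
    has_small_band: amplicon from the empty (uninterrupted) site was detected.
    """
    if has_large_band and has_small_band:
        return "heterozygous"          # insertion on one chromosome copy
    if has_large_band:
        return "homozygous insertion"  # insertion on both copies
    if has_small_band:
        return "no insertion"          # both copies are empty sites
    return "no product"                # failed PCR or rearranged site


# Hypothetical gel readings for three progeny plants.
for plant, bands in {"P1": (True, True), "P2": (True, False), "P3": (False, True)}.items():
    print(plant, call_insertion_genotype(*bands))
```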

Transgene Analysis
The two lines with verified heritable insertions (2-9 and 3-3) were characterized further to determine the number of copies of the original transgene. PCR analysis of plant 3-3 A T1 progeny (Supplemental Fig. S6 and additional samples) showed that the ORF1 and TPase genes have different segregation patterns. However, the inheritance of these genes is not consistent with two unlinked loci (n = 28, χ² = 11.6, 3 degrees of freedom [d.f.], P = 0.009), suggesting that there are multiple copies of at least one of the transgenes. In contrast, the segregation of the ORF1 and TPase genes in the 2-9 B T1 progeny is consistent with two unlinked loci (n = 24, χ² = 2.1, 3 d.f., P = 0.55; Supplemental Fig. S5 and additional samples). Southern-blot analysis of 2-9 B progeny (Fig. 6) shows that the hph probe hybridizes with two genomic fragments (approximately 6,000 and 14,000 bp) and the gfp probe hybridizes with a single approximately 8,000-bp band with a different segregation pattern than that of hph. Together these data show that for event 2-9, pPing broke into at least two pieces, which inserted as separate loci (one with the mPing donor site and TPase gene, the other harboring the hph and ORF1 genes). In contrast to the single-copy transgene, hybridization with an mPing probe produced seven to 12 bands for each plant, implying both transposition and an increase in copy number.
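The segregation test reported here can be reproduced with a short calculation. The sketch below assumes that "two unlinked loci" was tested against a 9:3:3:1 ratio for the four presence/absence classes (both genes, ORF1 only, TPase only, neither); that ratio and the example counts are assumptions, since the per-class counts are not given in the text. The P value uses the closed form of the chi-square upper tail for 3 d.f.

```python
import math


def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def dihybrid_expected(n):
    """Expected class counts for two unlinked dominant loci after selfing
    (9:3:3:1; assumed model, not stated explicitly in the text)."""
    return [n * ratio / 16 for ratio in (9, 3, 3, 1)]


def p_value_3df(chi2):
    """Upper-tail probability of a chi-square statistic with 3 d.f."""
    return (math.erfc(math.sqrt(chi2 / 2))
            + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2))


# The statistics reported in the text give back the reported P values.
print(round(p_value_3df(11.6), 3))  # event 3-3 A progeny
print(round(p_value_3df(2.1), 2))   # event 2-9 B progeny

# Hypothetical class counts for n = 24, for illustration only.
counts = [14, 5, 4, 1]
stat = chi_square(counts, dihybrid_expected(sum(counts)))
print(stat, p_value_3df(stat))
```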

Insertion Site Analysis
To determine the insertion pattern of mPing in soybean, the locations of the mPing insertions were determined based on the flanking sequences obtained from transposon display. Four transposon display bands could not be definitively placed on the published soybean sequence, suggesting they were located in unsequenced or repetitive regions. The remaining 72 mPing insertions from embryos and plants (both germinal and somatic events) were located at unique sites in the soybean genome (Supplemental Table S1). Forty-eight of these sites were from the T0, T1, and T2 plants generated from event 2-9. These insertions are on 18 of the 20 soybean chromosomes, with only three in the annotated pericentromeric region (Fig. 7). Similarly, 21 insertion sites from plant 3-3 A were mapped to 10 chromosomes with only one pericentromeric insertion.
The sequences flanking the mPing insertions were used to create a pictogram indicating the frequency of each base at the insertion site (Fig. 8B). To further characterize the insertion preference, the observed insertion locations were categorized according to their insertion site (i.e. exon, intron, untranslated region [UTR], intergenic; Fig. 8A). These results were compared to a representation of the actual genome composition, created by producing 100,000 simulated random insertions into the soybean genome. Eighty-four percent of the mPing insertions are within 5 kb of a predicted gene transcript, compared to an expected frequency of 47% if insertion sites were random. There is also a corresponding reduction in intergenic insertions (>5 kb from a gene) compared to the control (16.4% versus 53.1%). The distribution of the observed insertions is significantly different from the random insertion pattern (G test; G = 52.934, 6 d.f., P < 0.0001). The majority of this deviation is due to an overrepresentation of insertions within 2.5 kb up- or downstream of predicted transcripts (55.2% versus 17.9%).
As expected from this insertion preference, a number of insertions in genes were identified (Supplemental Table S1). The insertions that were identified in viable plant lines include exon insertions that may disrupt the function of a 60S ribosomal gene and a gene of unknown function. In addition, intron and UTR insertions were identified in calmodulin binding, atpob1 homolog, homeobox, peroxidase, and unknown function genes. Insertions into regions that may contain promoter sequences (<1,000 bp upstream) were found for an additional seven genes (Supplemental Table S1).
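The comparison between observed and simulated insertion categories is a G test of goodness of fit, which can be sketched as follows. The seven category counts and the random-model fractions in the example are invented for illustration (the actual per-category values are in Fig. 8A and are not reproduced here); only the reported statistic, G = 52.934 with 6 d.f., comes from the text.

```python
import math


def g_statistic(observed, expected_fractions):
    """G = 2 * sum(O * ln(O / E)), with E taken from the random model."""
    total = sum(observed)
    g = 0.0
    for count, fraction in zip(observed, expected_fractions):
        if count > 0:  # empty classes contribute nothing to the sum
            g += count * math.log(count / (total * fraction))
    return 2 * g


def p_value_6df(stat):
    """Upper-tail probability of a chi-square statistic with 6 d.f."""
    half = stat / 2
    return math.exp(-half) * (1 + half + half ** 2 / 2)


# The statistic reported in the text is far below the 0.0001 threshold.
print(p_value_6df(52.934))

# Hypothetical counts for seven insertion categories (72 insertions) and
# hypothetical fractions from a random-insertion simulation.
observed_counts = [4, 8, 5, 22, 18, 3, 12]
random_fractions = [0.05, 0.10, 0.03, 0.09, 0.09, 0.11, 0.53]
g_value = g_statistic(observed_counts, random_fractions)
print(g_value, p_value_6df(g_value))
```

With seven categories the test has 6 d.f., matching the value reported in the text.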

DISCUSSION
The development of soybean into a model for legume-specific processes will require improved gene discovery tools, such as transposon tagging. As mentioned, the transposon-tagging systems currently available for legumes, Ac/Ds and Tnt1, have characteristics that limit their widespread use for gene discovery. In contrast, mPing transposition in soybean does not exhibit these features. Transposition of mPing occurred under normal plant growth conditions over at least two generations (Figs. 4 and 5). Also, the mPing insertions emanating from the mPing reporter construct were not limited to linked sites (Fig. 7), consistent with the observation of unlinked transposition for somatic events in Arabidopsis (Yang et al., 2007). These features make it possible to saturate the genome with insertions by simply growing a population of plants with mPing activity, a relatively easy process compared to the tissue culture required for Tnt1 transposition or the transformation and crossing used for Ac/Ds mutagenesis. The relative ease with which mutants can be generated allows the efforts to be focused on mutant analysis.

Figure 6. Southern-blot analysis of event 2-9 B progeny. Three identical sets of DNA (20-22, 24 are T1 generation, 2-32 is a T2 plant that is homozygous for both hph and Ping TPase) were probed with DNA from three separate regions of the pPing plasmid. NEG, Untransformed control; M, 1Kb+ DNA ladder.

The ability to produce germinal transposition events is required for tagging because it allows for subsequent genetic analysis of mutants. To our knowledge, this study is the first to verify that mPing produces heritable insertions in a species other than rice (Figs. 4 and 5). When different embryo developmental stages (Fig. 2) were characterized for transposition, more activity was observed in the cotyledonary stage (8/10 versus 3/10 at the globular stage), suggesting that transposition may occur preferentially in some developmental stages. If so, understanding the developmental regulation may indicate ways to control transposition. In the meantime, this study showed that there is no transposition in progeny where the genes encoding Ping proteins are removed by segregation. This is evident in Figure 4 for the progeny of 2-9 B that lack either of the Ping proteins. The ability to effectively freeze mPing insertions in the genome will simplify the genetic analysis of mPing insertion mutants.
The locations of mPing insertions in soybean (Fig. 8) have similarities with the insertions observed in rice (Naito et al., 2009). These include a reduced preference for intergenic regions and a preference for insertion into gene-rich regions. The mechanisms underlying these patterns are unknown; however, one possibility is that mPing preferentially inserts into open chromatin, as has been hypothesized for other elements (Kuromori et al., 2004; Liu et al., 2009). Chromatin compactness is directly related to the frequency of nucleosomes, the basic unit of DNA packing around histone proteins. Exons show higher nucleosome density than introns (i.e. Arabidopsis [Chodavarapu et al., 2010], Caenorhabditis elegans [Valouev et al., 2008], and humans [Schones et al., 2008]). In rice, mPing exhibits an exon avoidance mechanism, reducing the exon insertion rate to 14% of that expected for random insertion, while insertion into introns is 51% of expected (Naito et al., 2009). This insertion pattern is consistent with the hypothesis that chromatin structure affects mPing insertion. However, another key difference between rice introns and exons is the average G/C content of only 37% for introns compared to 51% for exons (Yu et al., 2002). Accordingly, analysis of the mPing flanking sequences shows a preference for insertion into T/A-rich regions (Fig. 8B; Naito et al., 2006; Yang et al., 2007; Hancock et al., 2010). If the preference for insertion into T/A-rich sequences is involved in exon avoidance, a greater number of insertions into exons is expected in genomes like soybean that have exons with lower G/C content than rice (43% versus 51%, respectively; Yu et al., 2002; Tian et al., 2004). In fact, comparing our sampling of soybean insertions to the expected number under the random insertion model shows no significant exon avoidance (χ² = 0.002, d.f. = 1, P > 0.9), unlike in rice where the exon avoidance is highly significant (χ² = 81.2, d.f. = 1, P < 0.0001; Naito et al., 2009).
The observed rate of mPing insertions into genes normalized to gene density is comparable to the S3; Tissier et al., 1999; Courtial et al., 2001; d'Erfurth et al., 2003; Kuromori et al., 2004; Tadege et al., 2008). However, the characteristic that gives mPing an advantage is its preference for insertion near genes (51.4% of insertions are within 2.5 kb; Fig. 8A). This preference is ideal for activation tagging, which upregulates expression by placing enhancer sequences in close proximity. An mPing-based activation tag should be particularly advantageous for determining gene function in species like soybean, which has a high degree of genome duplication. While its ability to be modified to serve as an activation tag still remains to be investigated, there is no a priori reason why it should not work, as has been achieved with the Ds element (Qu et al., 2008). If successfully developed into an activation tag, then mPing will be an incomparable resource for transposon mutagenesis for crop genomics.
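The exon-avoidance comparison is a goodness-of-fit test with two classes (exon versus everything else), hence 1 d.f. The sketch below checks that the reported statistics are consistent with the reported P values and shows how such a statistic could be computed. The function names and the example counts are hypothetical, since the underlying exon counts are not given in the text.

```python
import math


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 d.f."""
    return math.erfc(math.sqrt(chi2 / 2))


def exon_chi_square(exon_observed, total, exon_fraction_random):
    """Two-class goodness of fit: exon insertions versus all others,
    against the exon fraction expected under random insertion."""
    expected_exon = total * exon_fraction_random
    expected_other = total - expected_exon
    observed_other = total - exon_observed
    return ((exon_observed - expected_exon) ** 2 / expected_exon
            + (observed_other - expected_other) ** 2 / expected_other)


# Reported statistics from the text: soybean (this study) and rice.
print(p_value_1df(0.002))  # soybean: no significant exon avoidance
print(p_value_1df(81.2))   # rice: highly significant exon avoidance

# Hypothetical example: 4 exon hits among 72 insertions when the random
# model predicts 5% of insertions in exons.
example_stat = exon_chi_square(4, 72, 0.05)
print(example_stat, p_value_1df(example_stat))
```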

CONCLUSION
The mPing miniature inverted repeat transposable element produces heritable insertions in soybean over multiple generations. It retains the transposition characteristics that are favorable for transposon tagging. These advantages include the ability to produce unlinked insertions without tissue culture and the strong preference for insertion in and near genes. Thus, mPing appears to overcome most of the limitations that impede other transposon-tagging systems in place today and will facilitate gene identification in soybean.

Vector Construction and Transformation
The pPing vector was constructed by subcloning the HindIII-SacI fragment from the pICDS-mP plasmid (Yang et al., 2007) into a SacI site that precedes a nos terminator in the pUHN4 vector (includes a StUbi3 promoter: hph gene: nos terminator selectable marker; Joshi et al., 2005). Somatic embryos from soybean (Glycine max) cultivar Jack (Nickell et al., 1990) were prepared and transformed as described by Trick et al. (1997), with modifications. Three plates of repetitive globular-stage embryos were bombarded at 7,584 kPa (1,100 psi) with 42 ng of plasmid DNA precipitated on 0.55 mg of 0.6-μm diameter Au. Transgenic lines were selected using FNL medium (Samoylov et al., 1998) supplemented with 20 μg mL⁻¹ hygromycin-B. Selected lines were given an event number (shot number-event number) and their transgenic status was verified by PCR before plant regeneration. Cotyledonary-stage embryos were produced in SHaM medium (Schmidt et al., 2005), desiccated for 1 week, and germinated on MSO medium as described by Parrott et al. (1988).

PCR Analysis
Genomic DNA was purified using the C-TAB method (Murray and Thompson, 1980), quantitated with the fluorescent DNA quantitation kit (Bio-Rad), and diluted to 5 ng μL⁻¹. GO Taq polymerase (Promega) was used for PCR genotyping (10 ng genomic DNA per reaction). The DNA of all samples was tested for quality by ensuring it was possible to amplify the soybean lectin

Figure 1. pPing construct and embryo developmental stages. A, Diagram of the Ping- and mPing-containing regions of the pPing plasmid used for soybean transformation (35Sp, cauliflower mosaic virus 35S promoter; nost, nopaline synthase terminator; gfp, green fluorescent protein). B, Representative images of the two sequential and distinct developmental stages of soybean embryo tissue culture that were harvested to test for mPing transposition.

Figure 2. PCR analysis of pPing-containing soybean lines. A, Diagram showing the position of the primers used to detect mPing excision (Flank For and Rev) and the presence of mPing (mPing 5′ and 3′). B, PCR products from two embryo development stages and leaf tissue from germinated plants (C). D, PCR results with primers to part of the gfp gene for globular-stage embryos (G) and leaf tissue for selected events. Controls with (+) and without (−) mPing show the expected band sizes (778 and 345 bp). NEG, Untransformed control; M, 100-bp DNA ladder.

Figure 3. Images of GFP expression detected in event 3-3 herbicide-bleached tissue. The globular-stage example contains both normal and GFP-expressing tissue, while the cotyledonary stage and immature leaf have untransformed controls for comparison.

Figure 4. T0 and T1 progeny analysis. A, Transposon display results for the repetitive globular-stage (G) and three plants (A, B, C) regenerated from the 3-13, 3-3, and 2-9 transgenic events. Samples 1 through 9 are the T1 progeny produced by selfing plant 2-9 B (the two lanes for each plant are from DNA isolated from two leaves). Numbered arrows indicate insertions that were present in the 2-9 B plant and inherited by subsequent progeny. The white square indicates a band that results from a nonmobilized mPing-containing transgene fragment. Black arrowheads indicate plant-specific insertions. Boxed region indicates examples of somatic insertion patterns that are specific to one or the other leaf. The presence of the TPase and ORF1* transgenes in each plant is shown below the transposon display gel (see Supplemental Fig. S5). B, PCR analysis of 2-9 B progeny plants using primers that detect the presence of the mPing insertions identified by transposon display. Band numbers correspond to the numbered arrows above. M, 100-bp ladder; NEG, untransformed control.

Figure 5. T2 progeny analysis.A, Transposon display of T2 progeny from 2-9 B T1 plants with both the ORF1 and TPase transgenes.Numbered arrows correspond to the bands present in the previous generation (Fig. 4).Black arrowheads denote novel insertions and their genomic locations are indicated on the right.B, PCR verification of a subset of the novel mPing insertions.M, 100-bp ladder; NEG, untransformed control.

Figure 7. mPing insertion sites: Location of mPing insertion sites (arrowheads) in the soybean genome identified from both globular-stage embryos and leaf tissue for two transgenic lines (event 2-9 and 3-3). Gray = pericentromeric regions, dark circle = centromere. [See online article for color version of this figure.]

Figure 8. Insertion site analysis. A, Histogram comparing the observed mPing insertion frequency to randomly generated insertions in the soybean genome. B, Pictogram representing the frequency of each nucleotide at the mPing insertion sites in soybean and rice (Naito et al., 2006).