First published online May 21, 2004; 10.1104/pp.104.041061
Plant Physiology 135:630-636 (2004)
© 2004 American Society of Plant Biologists
TILLING. Traditional Mutagenesis Meets Functional Genomics
Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109 (S.H., B.J.T.); and Department of Biology, University of Washington, Seattle, Washington 98195 (L.C.)
Most of the genes of an organism are known from sequence, but most of the phenotypes are obscure. Thus, reverse genetics has become an important goal for many biologists. However, reverse-genetic methodologies are not similarly applicable to all organisms. In the general strategy for reverse genetics that we call TILLING (for Targeting Induced Local Lesions in Genomes), traditional chemical mutagenesis is followed by high-throughput screening for point mutations. TILLING promises to be generally applicable. Furthermore, because TILLING does not involve transgenic modifications, it is attractive not only for functional genomics but also for agricultural applications. Here, we present an overview of the status of TILLING methodology, including Ecotilling, which entails detection of natural variation. We describe public TILLING efforts in Arabidopsis and other organisms, including maize (Zea mays) and zebrafish. We conclude that TILLING, a technology developed in plants, is rapidly being adopted in other systems.
Large-scale DNA sequencing projects have changed the way that biology is performed. The traditional pursuit of a gene starting with a phenotype has given way to the opposite situation: most of the genes are known from sequence, but most of the phenotypes are obscure. Thus, reverse genetics has become an important goal for many biologists, and new technologies are in great demand (Nagy et al., 2003).
Over the past few years, we and our colleagues have been developing a general strategy for reverse genetics that we call TILLING (for Targeting Induced Local Lesions in Genomes; McCallum et al., 2000).
The impetus for TILLING arose from a graduate student's frustration with the limitations of reverse-genetic methods available for Arabidopsis in the late 1990s. The student, Claire McCallum, went on to demonstrate the feasibility of TILLING by discovering mutations in two chromomethylase genes that were the subject of her thesis research (McCallum et al., 2000).
The original TILLING method used a commercial denaturing HPLC (DHPLC) apparatus for mutation discovery. However, we anticipated that this method would not scale up easily, and so we looked at alternative technologies. A method for enzymatic mismatch cleavage described by Tony Yeung seemed particularly attractive (Oleykowski et al., 1998).
For TILLING Arabidopsis, seeds are mutagenized by treatment with ethylmethanesulfonate (EMS). The resulting M1 plants are self-fertilized, and M2 individuals are used to prepare DNA samples for mutational screening, while their seeds are inventoried and sent to the Arabidopsis Biological Resource Center (ABRC) for eventual distribution. The DNA samples are pooled and arrayed in microtiter plates, and the pools are amplified using gene-specific primers. Amplification products are incubated with the CEL I endonuclease, a member of the S1 nuclease family of single-strand-specific nucleases (Oleykowski et al., 1998).
Upon detection of a mutation in a pool, the individual DNA samples are similarly screened to identify the individual carrying the mutation. This rapid screening procedure determines the location of a mutation to within ±10 bp for PCR products that are 1 kb in size. For the current mutagenized Arabidopsis populations that we are using, we find a density of 1 mutation per 235 kb, or approximately 4 point mutations per 8-fold pool gel (representing 768 plants; Greene et al., 2003).
A key advantage of high-throughput TILLING over competing methods is that the approximate position of each detected mutation is inferred from the size of the fragment, which greatly facilitates subsequent sequencing. Furthermore, the double-end labeling strategy provides confirmation within the pool screen, and further confirmation comes from identifying the same fragments in tracking down individuals. Therefore, sequencing is done with near certainty that a mutation exists within a small interval. Examination of a sequencing gel trace in the predicted location suffices to identify the mutated base and the substitution, and we use Sequencher trace analysis software (Gene Codes, Ann Arbor, MI) to facilitate this step. We have identified >3,000 Arabidopsis mutations in this way, typically using the readout from only the strand in which the primer is closer to the detected mutation. By contrast, methods that do not provide an approximate location for a detected mutation, such as DHPLC, require that the full amplified segment be interrogated by sequencing, and for a 1-kb segment this would require multiple runs to be carefully scrutinized. Detection of heterozygotes under such circumstances can be challenging, especially when peak heights vary, and false positives will greatly exacerbate this problem.
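The position inference described above can be sketched in a few lines of Python. This is only an illustration, not the ATP software: it assumes that the two ends of the amplicon carry different labels, so that a single cleavage site gives one labeled fragment per channel and the two fragment sizes add up to roughly the amplicon length. The function names, the tolerance, and the example sizes are hypothetical.

```python
def locate_mutation(amplicon_len, frag_a, frag_b, tolerance=10):
    """Estimate a mutation position (bp from the channel-A labeled end).

    frag_a and frag_b are the cleaved-fragment sizes read in the two
    label channels (an assumed double-end labeling layout). If the two
    sizes do not sum to the amplicon length within `tolerance`, the
    bands are not treated as a confirmed pair and None is returned.
    """
    if abs((frag_a + frag_b) - amplicon_len) > tolerance:
        return None
    # Each channel gives an independent estimate of the same site.
    estimate_from_a = frag_a
    estimate_from_b = amplicon_len - frag_b
    return round((estimate_from_a + estimate_from_b) / 2)


def sequencing_primer(amplicon_len, position):
    """Pick the primer closer to the mutation for a single sequencing read."""
    return "left" if position <= amplicon_len / 2 else "right"


if __name__ == "__main__":
    site = locate_mutation(1000, 320, 684)
    print(site, sequencing_primer(1000, site))
```

Requiring the two channels to agree mirrors the confirmation step mentioned in the text, and choosing the nearer primer reflects the practice of reading only the strand whose primer lies closer to the detected mutation.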
The high densities of EMS mutagenesis that we aim for raise concerns about background mutations being mistaken for mutations in target genes during phenotypic analysis. However, EMS-generated mutations at densities comparable to those in TILLING lines continue to be a basic learning tool for genetics, where background mutations obviously have not been a problem. On the one hand, mutations in genes expected to impact a phenotypic trait controlled by many genes, such as plant height or size or leaf shape, may be subject to epistatic interactions, and outcrossing to the wild type may be necessary. On the other hand, mutations in genes expected to impact a phenotype that is controlled by few genes are unlikely to produce phenotypes perturbed by background mutations, and outcrossing is not a prerequisite for analysis (Henikoff and Comai, 2003).
Based on mutation densities that we have measured in TILLING Arabidopsis and considering overall recombination rates, we have estimated that the probability that a closely linked lesion is mistaken for one in the target gene is only approximately 0.0005. Furthermore, crossing members of the allelic series will bring together two independently mutagenized genomes, and so by typing and looking for a correlation between the heteroallelic pair and the recessive phenotype, a researcher can further reduce concerns about background mutations being mistaken for mutations in the target gene. In conclusion, many phenotypes can usually be scored unequivocally in M3 populations. In certain cases, outcrossing might be necessary, but it should be possible to score most phenotypes after one or at most two generations. These strategies are the same as those employed in forward genetic screens for the past three-quarters of a century.
The high-throughput potential of TILLING led to the establishment of a TILLING facility in Seattle for the Arabidopsis community at large, the Arabidopsis TILLING Project (ATP; Till et al., 2003).
The scientist pursuing the function of this gene would find it advantageous to use TILLING. A search for mutations would be initiated, yielding approximately 10 mutations typically delivered 2 to 3 months later. Among these, our scientist would have a high probability of finding hypomorphic alleles. If this does not suffice, then all the available TILLING lines (approximately 7,000) could be searched, which would provide approximately 25 different point mutations, half of which on average would be missense.
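The mutation yields quoted above can be roughly checked from the measured density of 1 mutation per 235 kb. The sketch below is a back-of-the-envelope estimate under stated assumptions: a 1-kb amplicon, every position equally detectable, and every plant contributing equally. The function names are hypothetical.

```python
def expected_mutations(n_plants, amplicon_kb, kb_per_mutation=235.0):
    """Expected number of mutations found when screening n_plants
    for one amplicon, given a density of 1 mutation per kb_per_mutation."""
    return n_plants * amplicon_kb / kb_per_mutation


def plants_needed(target_mutations, amplicon_kb, kb_per_mutation=235.0):
    """Number of plants to screen to expect target_mutations mutations."""
    return target_mutations * kb_per_mutation / amplicon_kb


if __name__ == "__main__":
    print(expected_mutations(768, 1.0))    # one 8-fold pool gel
    print(expected_mutations(7000, 1.0))   # all available lines
    print(plants_needed(10, 1.0))          # plants for ~10 mutations
```

Under these assumptions the estimate is a little over 3 mutations per 768-plant gel and about 30 for 7,000 lines, the same range as the approximately 4 and approximately 25 given in the text. The sketch ignores practical factors such as actual fragment length and detection efficiency.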
In a significant minority of cases, there will be no available T-DNA insertion in the gene of interest. In such cases, TILLING could be employed to find knockout alleles, i.e. truncations. Ten TILLING mutations have an approximately 40% probability of including at least one truncation and 25 mutations have an approximately 70% probability, estimates that have been confirmed by analysis of the TILLED mutation set (Greene et al., 2003).
For a user, TILLING begins with a visit to the ATP Web site (http://tilling.fhcrc.org:9366), where she follows instructions for the interactive Web-based program CODDLE (for Codons Optimized to Detect Deleterious Lesions, http://www.proweb.org/coddle). CODDLE assists in all steps from selecting the gene region to ordering, after which TILLING begins. When mutations are discovered, confirmed, and sequenced, the user is automatically notified and sent to a Web page for coding and restriction site analyses and stock information. The series is also sent to The Arabidopsis Information Resource (TAIR; http://Arabidopsis.org) and formatted for entry into their polymorphism/mutation database. In this way, information on each ATP mutation is conveniently accessible to anyone using TAIR's polymorphism/mutation entry tool, which provides links to map and sequence viewers, to ABRC seed stocks, and to ATP. Seeds for TILLING lines are ordered from ABRC using direct links from the TAIR entry. Thus, all TILLING work is performed on M2 populations by ATP, and all growth and analysis of M3 lines are performed by the user. At its current capacity, ATP operates six or seven LI-COR analyzers in (typically) two daily shifts, and the team discovers an average of approximately 40 mutations per day. A user fee of $500 for either the initial screen or for screening the remainder of the collection partially offsets ATP expenses. Nevertheless, most ATP expenses are currently defrayed by a grant from the NSF Arabidopsis 2010 Project.
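The truncation probabilities quoted above (approximately 40% for 10 mutations and approximately 70% for 25) fit a simple independence model. The sketch below assumes that each mutation is a truncation independently with the same probability p, which is a simplification rather than a claim from the article. It back-calculates p from the 10-mutation figure and then predicts the 25-mutation figure.

```python
def per_mutation_rate(p_at_least_one, n):
    """Per-mutation probability p implied by P(at least one in n) = 1 - (1 - p)**n."""
    return 1.0 - (1.0 - p_at_least_one) ** (1.0 / n)


def prob_at_least_one(p, n):
    """Probability that at least one of n independent mutations is a truncation."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    p = per_mutation_rate(0.40, 10)     # implied truncation rate per mutation
    print(round(p, 4))
    print(round(prob_at_least_one(p, 25), 3))
```

Under this assumption the implied rate is about 5% truncations per mutation, and the predicted probability for 25 mutations is about 72%, consistent with the approximately 70% stated in the text.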
Incremental technical advances and improvements in efficiency have gradually reduced the cost of TILLING since ATP was established, and by mid-to-late 2005, it is anticipated that user fees will cover all ATP operating costs. In the first 2 years of operation, ATP delivered approximately 250 allelic series totaling >3,000 sequenced mutations.
Several computer programs have been developed or adapted to facilitate the TILLING process. As described above, CODDLE provides the front end for TILLING (Till et al., 2003).
CODDLE was developed by Nicholas Taylor and Elizabeth Greene as a general tool that can also be used for polymorphism analysis and for conveniently designing primers for any organism and any mutagen. Whether for TILLING or for polymorphism analysis, there is a need to assess the effect of missense mutations. We use protein sequence conservation as the basis for evaluating whether a missense mutation is likely to have an effect on the encoded protein. This can be quite effective; for example, the conservation-based SIFT program predicts with approximately 75% accuracy whether or not an amino acid change is damaging to a protein (Ng and Henikoff, 2003).
Upon completion of the TILLING process, a report is sent to the user. The PARSESNP (for Project Aligned Related Sequences and Evaluate SNPs; http://www.proweb.org/parsesnp/) program reports map and sequence positions for each result entered in graphical, tabular, and sequence formats (Taylor and Greene, 2003).
CODDLE, PARSESNP, and SIFT are general Web-based tools for functional genomics that have been adapted for TILLING. In addition, the TILLING team has implemented a variety of specialized programs for operations, data analysis, billing purposes, and other logistic needs. Although these programs were developed for ATP, they are adapted for other organisms as the need arises.
Dissemination of TILLING technology to benefit plant research has been a major goal of our NSF-funded project. The process is sufficiently complex, both technically and logistically, that we decided to hold two-day workshops so that potential TILLING providers in the academic community can observe the process at firsthand. Workshop attendees, in groups of three to five, observe all steps of the high-throughput TILLING process and obtain current protocols on a collaborative basis. Since the inception of workshops in November 2001, they have become increasingly popular and are now held almost monthly. In 2 years, our TILLING laboratory has hosted a total of 58 researchers from 13 different countries representing 20 different organisms. Several workshop attendees have subsequently established TILLING facilities at their own institutions, including Edwin Cuppen (Hubrecht Institute), Erin Gilchrist (University of British Columbia), and Cliff Weil (Purdue University). Workshops are also attended by researchers who have developed similar facilities independently, such as Charles Dearolf (Massachusetts General Hospital) and Jillian Perry (Sainsbury Institute). We believe that the workshop program is mutually beneficial, eliciting feedback and generating further collaborations while exposing participants to the challenges of a TILLING production operation.
Facile and efficient TILLING depends on the availability of two resources: a well-mutagenized population and genomic information. Chemical mutagenesis is usually simple to carry out and exploit. Well-developed and tested protocols are available for organisms that are genetic models, such as Arabidopsis, maize, the worm (Caenorhabditis elegans), and the fruit fly (Drosophila melanogaster), and standard conditions for forward-genetics studies have been successful for TILLING. Notably, once a satisfactory mutation density has been achieved, the size of the mutant population sufficient for efficient TILLING is relatively small (<10,000; Fig. 1). There is limited information on mutagenesis dosage and mutation yield for crop plants. Anecdotal evidence suggests that the efficiency of mutagenesis varies from species to species, even within Arabidopsis (Henikoff and Comai, 2003).
An important consideration is the structure of the mutagenized population library, which can vary considerably from organism to organism. For example, in Arabidopsis, after mutagenesis on M1 seed, we bank and TILL M2 DNA (the progeny of the M1) and bank and distribute M3 seed (the progeny of the M2). This is possible because an individual Arabidopsis plant produces thousands of seeds. However, in species that produce fewer than 100 seeds per individual, the M3 seed might be insufficient for distribution, and an additional generation would be necessary to produce and pool M4 seed from several M3 sibs. Genomic information is useful but not absolutely necessary for TILLING. In theory, once primers have been demonstrated to amplify the target region of a gene, TILLING should be possible. In practice, knowledge of the genome sequence improves the chance of success. For example, it allows in silico examination of mispriming and alternative targets. Polyploidy presents another challenge: If primers designed to amplify one locus in a tetraploid amplify the homeologous gene, pooling is changed as targets from two diploid genome equivalents are amplified per individual instead of one. Furthermore, the two targets might be amplified with different efficiency, further altering the pool composition. The problem can be addressed by determining the sequence of repeated loci and either designing locus-specific primers or adjusting the individual pooling scheme as needed. The considerable groundwork required for each target can delay high-throughput projects in unsequenced polyploid genomes.
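The effect of a co-amplified homeologous locus on pooling can be made concrete with a short calculation. This sketch assumes equal amplification efficiency for every copy, which the text notes may not hold, and a single mutant individual per pool. The function name and its parameters are hypothetical.

```python
from fractions import Fraction


def mutant_template_fraction(pool_size, loci_amplified=1, heterozygous=True):
    """Fraction of amplified template carrying the mutation in a pool
    that contains exactly one mutant individual.

    loci_amplified is the number of loci the primers amplify per
    individual (1 for a locus-specific primer pair, 2 when a homeolog
    is co-amplified). Equal amplification efficiency is assumed.
    """
    alleles_per_individual = 2 * loci_amplified
    mutant_alleles = 1 if heterozygous else 2
    return Fraction(mutant_alleles, pool_size * alleles_per_individual)


if __name__ == "__main__":
    print(mutant_template_fraction(8))                    # locus-specific primers
    print(mutant_template_fraction(8, loci_amplified=2))  # homeolog co-amplified
    print(mutant_template_fraction(4, loci_amplified=2))  # pool depth halved
```

Under these assumptions, co-amplifying a homeolog in an 8-fold pool halves the mutant fraction from 1/16 to 1/32, and halving the pool depth to 4 restores it to 1/16. This is one way to read the suggestion to adjust the pooling scheme when locus-specific primers are not available.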
Although there has been sufficient demand to keep ATP in continuous operation, Arabidopsis is rich in reverse-genetic resources, and TILLING is expected to be in greater demand where other methods are less applicable. Fortunately, the methodologies that we have developed and the pipeline that we have established for ATP are directly applicable to other organisms, and we and others have extended TILLING to a variety of organisms, especially crop plants. For example, Anawah has several programs for nontransgenic crop development, including for fruits and vegetables, cereals, soy, and peanuts (http://www.anawah.com/programs/). The publicly funded ATP project has expanded to organisms other than Arabidopsis, becoming the Seattle TILLING Project (STP). STP collaborates with workshop attendees who are motivated to establish TILLING but are not prepared to make the substantial investment that is required. Once a mutagenized population is available, a pilot screen is performed, primarily to determine the suitability of a population for TILLING. Variations in mutation rate between organisms, between mutagens, and even between batches of seed or pollen are sufficient to necessitate pilot-scale screening before investing a major effort. Pilot screens also provide for an evaluation of DNA quality and other variables that affect the efficiency of TILLING. As part of our NSF Plant Genome Research Project (PGRP) award, we are able to offer TILLING pilot screening to parties who have potentially suitable mutagenized populations in organisms of interest to PGRP. We find that it is worthwhile to involve STP in the planning of a pilot screen at an early stage, when we can make recommendations based on our experiences with a variety of different organisms. Several pilot projects have been performed in collaboration with workshop attendees who have established mutagenized populations. 
A pilot project is usually accomplished by preparing, normalizing, and arraying several hundreds of DNA samples from individual plants, ordering primers using CODDLE, and screening in the standard way. Pilot projects have been performed on various mutagenized populations of rice, maize, soybeans (Glycine max), and Chlamydomonas with NSF PGRP support.
Plants are well suited for TILLING because seeds can be stored for long periods of time, allowing screening to be performed on the same mutant population indefinitely. Animals are also suitable for TILLING if there is an efficient strategy for germ plasm recovery. In two instances, this has been accomplished by saving the live progeny of screened individuals. Dearolf and co-workers used DHPLC to screen EMS-mutagenized Drosophila, obtaining an allelic series for the awd gene (Bentley et al., 2000).
The use of live progeny rather than germ plasm storage means that screening should be completed within a single generation, and both studies were limited to single genes. Recently, however, Cuppen and co-workers used a modification of the STP method to obtain allelic series from 16 zebrafish genes within a single generation (Wienholds et al., 2003b).
Other solutions to the germ plasm storage problem have been applied to animal TILLING. Bruce Draper, Cecilia Moens, and colleagues at Anawah have TILLed ethylnitrosourea-treated zebrafish using frozen sperm for germ plasm recovery (Draper et al., 2004).
In addition to allowing efficient detection of mutations, high-throughput TILLING technology is ideal for the detection of natural polymorphisms: CEL I cuts with partial efficiency, allowing the display of multiple mismatches in a DNA duplex. Therefore, interrogating an unknown homologous DNA by heteroduplexing to a known sequence reveals the number and position of polymorphic sites. Both nucleotide changes and small insertions and deletions are identified, including at least some repeat number polymorphisms. We call this method Ecotilling (Comai et al., 2004).
Each SNP is recorded by its approximate position within a few nucleotides. Thus, each haplotype can be archived based on its mobility (Fig. 2). Sequence data can be obtained with a relatively small incremental effort using aliquots of the same amplified DNA that is used for the mismatch-cleavage assay. The left or right sequencing primer for a single reaction is chosen by its proximity to the polymorphism. Sequencher software performs a multiple alignment and discovers the base change, which in each case confirmed the gel band.
Ecotilling can be performed more cheaply than full sequencing, the method currently used for most SNP discovery. We simply screen plates containing arrayed ecotypic DNA rather than pools of DNA from mutagenized plants. Because detection is on gels with nearly base pair resolution and background patterns are uniform across lanes, bands that are of identical size can be matched, thus discovering and genotyping SNPs in a single step. In this way, ultimate sequencing of the SNP is simple and efficient, made more so by the fact that the aliquots of the same PCR products used for screening can be subjected to DNA sequencing.
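The band-matching idea behind Ecotilling can be sketched as follows. The code is a toy model, not the published analysis: it assumes that each accession has already been reduced to a list of cleavage-band positions (in bp) relative to the reference sequence, and that bands within a few bp of each other across lanes mark the same polymorphic site. The accession names, positions, and tolerance are invented for illustration.

```python
from collections import defaultdict


def cluster_bands(lanes, tol=3):
    """Merge band positions from all lanes into candidate polymorphic sites.

    lanes maps accession name -> list of band positions (bp). Sorted
    positions are chained together while neighbors differ by <= tol;
    each cluster is represented by its middle element.
    """
    positions = sorted(p for bands in lanes.values() for p in bands)
    clusters = []
    for p in positions:
        if clusters and p - clusters[-1][-1] <= tol:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return [c[len(c) // 2] for c in clusters]


def genotype(lanes, sites, tol=3):
    """Call presence/absence of each site in each accession."""
    return {
        acc: tuple(any(abs(b - s) <= tol for b in bands) for s in sites)
        for acc, bands in lanes.items()
    }


def haplotypes(calls):
    """Group accessions that share an identical band pattern."""
    groups = defaultdict(list)
    for acc, pattern in calls.items():
        groups[pattern].append(acc)
    return dict(groups)


if __name__ == "__main__":
    lanes = {"acc1": [212, 640], "acc2": [213, 641], "acc3": [640], "acc4": []}
    sites = cluster_bands(lanes)
    print(sites)
    print(haplotypes(genotype(lanes, sites)))
```

Grouping accessions by shared band pattern means that only one representative per pattern needs to be sequenced to identify the underlying base changes, which is the source of the cost saving described above.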
The need for allelic series of mutations for functional studies is not likely to abate in the near future, and the increasing availability of genomic sequence will further fuel demand. Therefore, we expect that our high-throughput TILLING method, or something like it, will become increasingly popular, especially for agriculture, where there is so much useful knowledge to be gained from functional genomics and where nontransgenic methods are especially desirable. Our ability to screen for point mutations on a production scale means that other steps in the process become limiting. Achieving high and consistent levels of mutagenesis while maintaining viability and fertility is a major challenge, especially for rice, where we continue to encounter difficulties in obtaining a suitably mutagenized population. Another challenge is what takes place after an allelic series is delivered: High-throughput TILLING discovers so many mutations that it sometimes can be a major effort for a user to adequately perform the necessary phenotypic analysis and genotyping.
TILLING depends upon the ability to detect mismatches in DNA heteroduplexes, but competition is intense to develop other ways to discover and screen for single-nucleotide differences. For the long term at least, it is probably impossible to predict what technologies will prevail (Henikoff and Comai, 2003).
Received February 15, 2004; returned for revision March 9, 2004; accepted March 9, 2004.
www.plantphysiol.org/cgi/doi/10.1104/pp.104.041061.
* Corresponding author; e-mail steveh{at}fhcrc.org; fax 206-667-5889.
Alonso JM, Stepanova AN, Leisse TJ, Kim CJ, Chen H, Shinn P, Stevenson DK, Zimmerman J, Barajas P, Cheuk R, et al (2003) Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science 301: 653-657
Ashburner M (1990) Drosophila, A Laboratory Handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Bentley A, MacLennan B, Calvo J, Dearolf CR (2000) Targeted recovery of mutations in Drosophila. Genetics 156: 1169-1173
Chuang CF, Meyerowitz EM (2000) Specific and heritable genetic interference by double-stranded RNA in Arabidopsis thaliana. Proc Natl Acad Sci USA 97: 4985-4990
Colbert T, Till BJ, Tompa R, Reynolds S, Steine MN, Yeung AT, McCallum CM, Comai L, Henikoff S (2001) High-throughput screening for induced point mutations. Plant Physiol 126: 480-484
Comai L, Young K, Till BJ, Reynolds SH, Greene EA, Codomo CA, Enns LC, Johnson J, Burtner C, Oden AR, et al (2004) Efficient discovery of DNA polymorphisms in natural populations by Ecotilling. Plant J 37: 778-786
Draper BW, McCallum CM, Stout JL, Slade AJ, Moens CB (2004) A high-throughput method for identifying ENU-induced point mutations in zebrafish. Methods Cell Biol (in press)
Greene EA, Codomo CA, Taylor NE, Henikoff JG, Till BJ, Reynolds SH, Enns LC, Burtner C, Johnson JE, Odden AR, et al (2003) Spectrum of chemically induced mutations from a large-scale reverse-genetic screen in Arabidopsis. Genetics 164: 731-740
Henikoff JG, Greene EA, Taylor N, Pietrokovski S, Henikoff S (2002) Using the Blocks database to recognize functional domains. In AD Baxevanis, D Davison, R Hogue, G Page, GD Stormo, L Stein, eds, Current Protocols in Bioinformatics. John Wiley & Sons, New York, NY
Henikoff S, Comai L (2003) Single-nucleotide mutations for plant functional genomics. Annu Rev Plant Physiol Plant Mol Biol 54: 375-401
Hurlstone AF, Haramis AP, Wienholds E, Begthel H, Korving J, van Eeden FJ, Cuppen E, Zivkovic D, Plasterk RH, Clevers H (2003) The Wnt/beta-catenin pathway regulates cardiac valve formation. Nature 425: 633-637
Jackson AL, Bartz SR, Schelter J, Kobayashi SV, Burchard J, Mao M, Li B, Cavet G, Linsley PS (2003) Expression profiling reveals off-target gene regulation by RNAi. Nat Biotechnol 21: 635-637
McCallum CM, Comai L, Greene EA, Henikoff S (2000) Targeted screening for induced mutations. Nat Biotechnol 18: 455-457
Middendorf LR, Bruce JC, Bruce RC, Eckles RD, Grone DL, Roemer SC, Sloniker GD, Steffens DL, Sutter SL, Brumbaugh JA (1992) Continuous, on-line DNA sequencing using a versatile infrared laser scanner/electrophoresis apparatus. Electrophoresis 13: 487-494
Nagy A, Perrimon N, Sandmeyer S, Plasterk R (2003) Tailoring the genome: the power of genetic approaches. Nat Genet 33 (suppl.): 276-284
Ng PC, Henikoff S (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31: 3812-3814
Oleykowski CA, Bronson Mullins CR, Godwin AK, Yeung AT (1998) Mutation detection using a novel plant endonuclease. Nucleic Acids Res 26: 4597-4602
Smits BMG, Mudde J, Plasterk RHA, Cuppen E (2004) Target-selected mutagenesis of the rat. Genomics 83: 332-334
Taylor N, Greene EA (2003) PARSESNP: A tool for the analysis of nucleotide polymorphisms. Nucleic Acids Res 31: 3808-3811
Till BJ, Reynolds SH, Greene EA, Codomo CA, Enns LC, Johnson JE, Burtner C, Odden AR, Young K, Taylor NE, et al (2003) Large-scale discovery of induced point mutations with high-throughput TILLING. Genome Res 13: 524-530
Waterhouse PM, Graham MW, Wang MB (1998) Virus resistance and gene silencing in plants can be induced by simultaneous expression of sense and antisense RNA. Proc Natl Acad Sci USA 95: 13959-13964
Wienholds E, Koudijs MJ, van Eeden FJ, Cuppen E, Plasterk RH (2003a) The microRNA-producing enzyme Dicer1 is essential for zebrafish development. Nat Genet 35: 217-218
Wienholds E, Schulte-Merker S, Walderich B, Plasterk RH (2002) Target-selected inactivation of the zebrafish rag1 gene. Science 297: 99-102
Wienholds E, van Eeden FJ, Kosters M, Mudde J, Plasterk RH, Cuppen E (2003b) Efficient target-selected mutagenesis in zebrafish. Genome Res 13: 2700-2707