A sequence-ready physical map of barley anchored genetically by two million single-nucleotide polymorphisms.

Barley (Hordeum vulgare) is an important cereal crop and a model species for Triticeae genomics. To lay the foundation for hierarchical map-based sequencing, a genome-wide physical map of its large and complex 5.1 billion-bp genome was constructed by high-information content fingerprinting of almost 600,000 bacterial artificial chromosomes representing 14-fold haploid genome coverage. The resultant physical map comprises 9,265 contigs with a cumulative size of 4.9 Gb representing 96% of the physical length of the barley genome. The reliability of the map was verified through extensive genetic marker information and the analysis of topological networks of clone overlaps. A minimum tiling path of 66,772 minimally overlapping clones was defined that will serve as a template for hierarchical clone-by-clone map-based shotgun sequencing. We integrated whole-genome shotgun sequence data from the individuals of two mapping populations with published bacterial artificial chromosome survey sequence information to genetically anchor the physical map. This novel approach in combination with the comprehensive whole-genome shotgun sequence data sets allowed us to independently validate and improve a previously reported physical and genetic framework. The resources developed in this study will underpin fine-mapping and cloning of agronomically important genes and the assembly of a draft genome sequence.

Barley (Hordeum vulgare) is an important source of human and animal nutrition and underpins the malting and brewing industries. It is among the earliest domesticated crop plants and is adapted to a wide range of environmental conditions. A wealth of genomic resources such as dense genetic maps, ESTs, complementary DNA libraries, and an atlas of gene expression have been developed in the past two decades (for review, see Schulte et al., 2009). Moreover, comprehensive germplasm collections from cultivars and wild accessions (van Hintum and Menting, 2003) as well as extensive mutant collections provide a sound foundation for genetic studies into developmental and morphological processes. However, the full exploitation of these resources for basic research and crop improvement has been hampered by the lack of a reference genome sequence. The extent to which genomic research in a crop species may be spurred by the availability of a reference genome is aptly illustrated by the large number of agronomically important rice (Oryza sativa) genes that have been successfully cloned since the release of the rice genome sequence (for review, see Huang et al., 2013).
Before the advent of next-generation sequencing (NGS) technology, the scale of a barley genome project seemed daunting, owing to the large size (5.1 Gb) and the high repeat content of the genome. After a combination of sequencing of a small number of bacterial artificial chromosomes (BACs; Wicker et al., 2006; Steuernagel et al., 2009; Taudien et al., 2011) and shallow whole-genome sequencing (Wicker et al., 2008) demonstrated the utility of NGS for assembling a large and complex genome, the prospects of obtaining a high-utility sequence of the barley genome were considerably enhanced. In addition, the development of innovative sequence assembly algorithms made it possible to obtain robust whole-genome shotgun assemblies of mammalian genomes using only paired-end NGS sequencing reads of different insert sizes (Gnerre et al., 2011). Using these analytical approaches and libraries with various insert sizes, whole-genome shotgun assemblies of the bread wheat (Triticum aestivum) progenitors Triticum urartu (Ling et al., 2013) and Aegilops tauschii (Jia et al., 2013) have been published recently.
A major challenge in applying whole-genome shotgun sequencing to large and complex plant genomes is their highly repetitive structure. In barley, stretches of genomic sequence as large as several hundred kilobases may be entirely composed of nested transposable elements (Wicker et al., 2005). Further complications arise from highly similar families of paralogous genes, the remnants of recent or ancient polyploidization events (Levy and Feldman, 2002) or pseudogene formation (Wicker et al., 2011). As a result of these difficulties, so-called gene-space assemblies from whole-genome sequencing data have been widely adopted as an enabling alternative for many research applications, particularly when the majority of the gene complement has been successfully embedded into a broader genomic context. Such a milestone was recently completed for barley with the release of a gene-space assembly embedded in a sequence-enriched physical and genetic framework (The International Barley Genome Sequencing Consortium, 2012). Half of the physical contigs were genetically positioned through integration of a genome-wide physical map with sequence data derived from BAC end sequencing, sequencing of complete BAC clones, whole-genome shotgun contigs, and shotgun sequencing of sorted chromosome arms. The integration of transcriptome sequence into this framework defined 26,159 high-confidence gene models and provided genomic context for them.
Despite this achievement, the International Barley Genome Sequencing Consortium has continued to promote a map-based sequencing strategy encompassing restriction-based fingerprinting of deep-coverage BAC libraries, an approach that has had successful precedents in Arabidopsis (Arabidopsis thaliana), rice, and maize (Zea mays) and several animal genomes (Arabidopsis Genome Initiative, 2000; Lander et al., 2001; Waterston et al., 2002; International Rice Genome Sequencing Project, 2005; Schnable et al., 2009). Map-based or hierarchical sequencing approaches reduce the complexity of the sequence assembly process by partitioning the genome into smaller pieces. Although this involves the time-consuming and laborious steps of BAC library construction, restriction endonuclease fingerprinting, BAC contig assembly, and subsequent clone-by-clone sequencing, it currently remains indispensable in plant genome projects that have the goal of constructing contiguous pseudomolecules (Feuillet et al., 2011, 2012). We recently developed a method termed POPSEQ (for population sequencing) for linearly ordering whole-genome shotgun sequence (WGS) assemblies in the absence of highly developed genomic resources. As a proof of concept, shallow sequencing of two small biparental mapping populations was shown to greatly improve the genetic anchoring of the International Barley Genome Sequencing Consortium whole-genome shotgun assembly of the barley cv Morex (Mascher et al., 2013). While we demonstrated that POPSEQ anchoring of a whole-genome shotgun assembly is independent of physical map construction, we also argued that the data obtained should expedite the ongoing physical mapping project. In particular, the high marker density afforded by whole-genome sequencing should enable direct anchoring of the majority of fully sequenced BAC clones.
Here, we extend and improve the previously established physical and genetic framework of barley (The International Barley Genome Sequencing Consortium, 2012). We describe the process of physical map construction and report the sequence-ready minimum tiling path (MTP) of overlapping clones that will be instrumental in producing a draft genome sequence. We illustrate the usefulness of our resource for map-based cloning and introduce POPSEQ as a new approach for genetic anchoring of physical maps.

Assembly of the Physical Map
Six large-insert BAC libraries of barley cv Morex had been constructed and characterized previously (Yu et al., 2000; Schulte et al., 2011). Additionally, gene-containing BACs had been identified in the library HVVMRXALLhA and rearrayed for fingerprinting (The International Barley Genome Sequencing Consortium, 2012). We analyzed 690,912 barley BAC clones from these six libraries by high-information content fingerprinting (HICF; Luo et al., 2003; Table I). Fingerprint profiles were checked for plate-wide, neighboring, or chloroplast contamination as well as for clones containing more than 250 or fewer than 30 fragments. A total of 571,007 (82.6%) clones representing approximately 14-fold haploid genome coverage passed our quality filters and were imported into the FingerPrint Contig (FPC) assembly program (Soderlund et al., 2000). For the initial assembly, we used very stringent parameters to avoid misassemblies due to highly similar repetitive regions present in high copy number in the barley genome and to construct a robust framework that was to be refined in further steps. This preliminary assembly organized 481,158 clones into 18,570 fingerprinted contigs (FP contigs), leaving 89,849 clones as singletons.
Subsequently, further automatic assembly iterations were performed with a step-wise reduction of stringency (Supplemental Fig. S1), reducing the number of contigs to 9,436 and incorporating 37,050 additional clones. Manual editing of this contig set identified 171 misassembled contigs, each harboring two groups of markers from different genetic positions. These contigs were split up and rebuilt. Less stringent overlap criteria (Sulston score 1e-25) supported by genetic marker data allowed us to manually merge 130 contigs. Finally, 313 contigs composed only of two equivalent clones from BAC library HVVMRXALLhA and its rearrayed correspondent HVVMRX83KhA were considered as singletons. Manual editing resulted in a final build of 9,265 contigs with an average size of 538 kb (Table II). The assembly consists of approximately 4.0 million unique consensus bands and is estimated to cover 4.9 Gb (96%) of the 5.1-Gb genome of barley. In relation to genome size, the number of contigs in the final barley map is comparable to metrics of the rice physical map (1,019 contigs, approximately 450-Mb genome; Chen et al., 2002) and to the physical map of the 1-Gb bread wheat chromosome 3B (1,036 contigs; Paux et al., 2008).

Validation of the Physical Map
In addition to cross-checking contigs with genetic marker information during manual editing, the reliability of the map was corroborated by using an alternative program for the analysis of HICF data. The linear topology contigs (LTC) tool (Frenkel et al., 2010) had been specifically designed for the construction and verification of physical maps of complex genomes. In particular, LTC is able to detect misassembled contigs, in which the net of clone overlaps has branched or cyclical topological structures that contradict the linearity of the chromosomes. LTC analysis confirmed that about 90% of contigs constructed by FPC consisted only of reliably overlapping clones in a linear order (Fig. 1). The majority of the remaining 10% of putatively branched contigs contained only one pair of clones with unreliable overlap. Such contigs were considered less problematic (and hence retained) because a single unreliable clone overlap can result from low-quality fingerprinting.
We checked the distribution of clones from different libraries across the genome (Fig. 2). Clones from the gene-enriched library HVVMRX83KhA were preferentially located in the distal parts of the chromosomes, which have a higher gene density compared with pericentromeric regions (The International Barley Genome Sequencing Consortium, 2012). The other libraries showed a fairly equal distribution along the chromosomes.

Establishment of an MTP
An MTP is a series of clones that covers the genome with a predefined minimal overlap between pairs of adjacent clones. MTPs are constructed to minimize the number of BAC clones that have to be sequenced for complete coverage of the genome. Estimating the exact order of contigs and the overlap between pairs of adjacent contigs only using fingerprinting profiles is computationally challenging. We applied the LTC tool to pick an MTP for each of the 9,265 BAC contigs. After recalculating the global order of clones in each contig, LTC selected an MTP consisting of 66,772 clones. Using the genetic anchoring of contigs, the MTP BACs were rearrayed into 178 microtiter plates, grouping together clones from the same chromosome and separating all clones of unanchored contigs. DNA from the clones for each chromosome was combined into plate, row, and column pools to facilitate future screening. Individual MTP clones as well as BAC pools can be obtained from the French Plant Genomic Resource Centre (CNRGV; http://cnrgv.toulouse.inra.fr/en/library/barley). A list of all MTP clones is available as Supplemental Table S1.
Figure 2. Normalized distribution of clones from different libraries along chromosomes. The distribution of clones from different libraries was plotted along the seven barley chromosomes. The counts of clones from different libraries were normalized by calculating the ratio between observed and expected clone number in a sliding window. Contigs toward telomeres were enriched for clones of library HVVMRX83KhA, containing BAC clones identified for the presence of genes (Madishetty et al., 2007).

Anchoring the Physical Map by POPSEQ
Previously, we had anchored 4,556 physical contigs (3.9 Gb) to genetic positions through approximately 3,000 single-nucleotide polymorphism (SNP) markers and approximately 500,000 genotyping-by-sequencing markers (The International Barley Genome Sequencing Consortium, 2012). In addition, 1,881 contigs could be assigned to chromosome arms by using sequence data from flow-sorted chromosome arms (The International Barley Genome Sequencing Consortium, 2012). Most contigs without genetic or chromosomal positions were either short or lacked sequence or marker information. We recently introduced the POPSEQ approach to genetically anchor highly fragmented sequence assemblies by whole-genome sequencing of individuals of a segregating population (Mascher et al., 2013). The same methodology may be used to place BAC contigs or single BAC clones into a genetic framework. For this purpose, we first projected the genetically anchored WGS contigs onto the physical map. The WGS assembly is a necessary intermediary, as the sequence information attached to the physical map is incomplete and consequently cannot serve as an appropriate reference for short-read alignment and SNP-calling algorithms. We had anchored WGS contigs of barley cv Morex using two different mapping populations (Morex × Barke and the Oregon Wolfe Barleys [OWB]). The iSelect framework (Comadran et al., 2012) with 1,690 genetic bins and a genotyping-by-sequencing map with 983 bins were used for Morex × Barke and OWB, respectively. The iSelect framework was also used in the previous effort to anchor the physical map of barley (The International Barley Genome Sequencing Consortium, 2012).
By stringent homology searches against fully sequenced BACs and BAC end sequences requiring at least 99.5% sequence identity and a minimum alignment length of 500 bp, we assigned 82,381 WGS contigs to 5,872 BAC contigs (72% of BAC contigs with associated sequence information). The genetic position of a physical contig was then set to the median genetic position of all POPSEQ-anchored WGS contigs assigned to it. A total of 4,920 and 5,002 BAC contigs could be anchored to the Morex × Barke and OWB maps, respectively. In both cases, three-quarters of contig positions were supported by at least two WGS contigs. Out of 4,411 BAC contigs anchored to both maps, 92.8% were positioned no farther than 5 centimorgans (cM) apart on the respective maps (Fig. 3A). The proportion of contigs anchored within 1 or 2 cM was 56.4% or 74.3%, respectively. A similar degree of agreement between different maps has already been reported for the anchoring of WGS contigs (The International Barley Genome Sequencing Consortium, 2012). This outcome is the result of the different resolutions of the underlying genetic maps as well as the procedures used for integration. Merging the anchoring results from both maps, we obtained a set of 5,193 anchored contigs (Table III). The number of anchored contigs varied considerably between distal and pericentromeric regions (Fig. 4). In distal regions, the ratio of physical to genetic distance was 1 to 10 Mb per cM, while it was 100 to 500 Mb per cM in pericentromeric regions. Of all contigs anchored to the OWB or Morex × Barke framework, 3,830 (73.8%) with a cumulative length of 3.5 Gb are also anchored to the published physical and genetic framework (The International Barley Genome Sequencing Consortium, 2012). Chromosomal assignments between both maps agreed in 97.6% of the cases, and cM coordinates disagreed in only 8.6% of cases (Fig. 3B).
Similar to the anchoring of WGS contigs (Mascher et al., 2013), discordant contig placements mostly occurred in the genetic centromere. Although the POPSEQ anchoring contains 14% more contigs than the published physical and genetic framework (The International Barley Genome Sequencing Consortium, 2012), the cumulative length of all anchored contigs increases by only 1.3%. The high number of markers enabled us both to include shorter contigs (mean contig size of 761 versus 856 kb) and to exclude some longer contigs with inconsistent marker information. Furthermore, we applied more stringent alignment criteria (500-bp minimum alignment length and 99.5% or greater identity) compared with our previous effort.
Led by the observation that POPSEQ is able to anchor shorter contigs, we attempted to anchor single, fully sequenced BAC clones. Instead of aggregating anchoring information per physical contig, we averaged genetic positions at a per-BAC level. A total of 6,243 (99.4%) of all sequenced BACs harbored WGS contigs, and 5,591 (89.1%) could be anchored to the Morex × Barke or OWB framework (Table III). The genetic positions of BACs and their corresponding FP contigs agreed in 97.6% of cases. As the number of discordant chromosome assignments was three times higher than the number of discordant cM positions, disagreement between both anchoring methods arises most likely from single wrongly placed clones that are located on different chromosomes from their assigned physical contig. We found pairs of BACs on 71 FP contigs that were anchored to different chromosomes. For this analysis, BACs were required to harbor at least two WGS contigs that were consistently anchored in both the Morex × Barke and OWB frameworks. Among the anchored BACs, there were also 278 singleton clones that could now be assigned to chromosomal locations to guide their assignment to contigs based on sequence similarity.

DISCUSSION
A sequence-ready genome-wide physical map of the barley genome has been constructed. More than half a million clones representing 14-fold genome coverage have been fingerprinted and assembled into physical contigs. The MTP we have established will provide the framework for clone-by-clone sequencing of the barley genome. Restricting attention to single BAC clones reduces the algorithmic complexity of sequence assembly, thus enabling the use of short NGS reads and established assembly programs that would result in highly fragmented assemblies when applied on a whole-genome scale. The International Barley Genome Sequencing Consortium (2012) has already sequenced approximately 6,200 BAC clones.
Genetic anchoring of the barley physical map had been reported earlier (The International Barley Genome Sequencing Consortium, 2012). However, chromosomal locations could be assigned to only about half of all physical contigs through the integration of marker sequences from various genetic maps with BAC contigs. In this study, we applied a new approach for integrating the physical map with genetic maps by extending the POPSEQ method, originally developed for fragmented whole-genome assemblies (Mascher et al., 2013), to physical contig assemblies. This novel, straightforward approach was able to anchor more contigs than the complex, multilayered strategy presented earlier. The high number of informative markers obtained by high-throughput sequencing enabled genetic anchoring at a clone-by-clone scale and thereby provided an independent method of validating the physical map. This method could also complement other genome projects following a hierarchical shotgun strategy (Paux et al., 2008; Lin et al., 2010; Dohm et al., 2012). At present, POPSEQ anchoring of the barley physical map is limited by the paucity of high-quality sequence information from each individual BAC contig.
Although approximately 300,000 BACs have been end sequenced, these sequences are shorter (less than 1,000 bp) than assembled sequence contigs and mostly originate from repetitive regions, because they are distributed randomly across the genome. As the physical length of all contigs anchored by POPSEQ amounts to 90% of the physical length of all contigs with associated WGS contigs, we anticipate that a substantial increase of anchoring efficiency can only be achieved when more BAC sequence information becomes available. The full power of POPSEQ to genetically anchor the physical map of barley will only be apparent when a completely sequenced MTP is available. Our preliminary analysis of approximately 6,200 fully sequenced clones showed that we can reasonably expect the vast majority of clones to harbor an anchored WGS contig. Alternatively, the sequenced MTP clones may serve as a reference for read mapping after removal of the sequence redundancy introduced by overlapping clones. SNP calling and genotyping could then be performed on the sequence scaffolds of the physical contigs, which would then be directly anchored to a genetic map without the intermediate step of a WGS assembly. The genetic anchoring of individual clones will enable us to further validate contig integrity, identify erroneously placed clones, and position singleton clones.
Apart from its value in assembling a draft genome sequence, the barley physical map presented here will assist genomic research by accelerating the isolation of genes underlying phenotypic traits. As an illustration, we explored the utility of the resource by determining the physical positions of 14 barley genes that had been isolated through positional cloning or genome-wide association studies (Fig. 5). All genes could be associated with contigs on the physical map, which were in most cases longer than the contigs of local restriction maps originally reported. For example, three BACs in the vicinity of Photoperiod-H1, a major regulator of photoperiod response, had been identified through BAC library screening (Turner et al., 2005). Photoperiod-H1 is annotated as a high-confidence gene on a whole-genome shotgun contig (morex_contig_94710). This contig has high sequence similarity (more than 99.9% over 5,000 bp) to the sequenced BAC HVVMRXALLhA0598A09, which is part of the physical map contig FP_contig_2992. The physical map also provides extensive local information for genes that were not found through positional cloning. For example, Required for mlo-specified resistance2, a gene involved in resistance to powdery mildew, was identified through a synteny-based approach, and Intermedium-c, a gene that modifies lateral spikelet fertility, was identified through association mapping and analysis of conserved gene order. Both genes were positioned on genetically anchored physical contigs that provide extended information about genomic context and local neighborhood. Therefore, the physical map of barley and its associated sequence and clone resources will serve as a hub for marker development and candidate gene identification in ongoing map-based cloning efforts.
The majority of barley genes cloned to date are located in the distal regions of the chromosomes, where the ratio of physical and genetic distances is tractable. In telomeric regions of barley, the ratio of genetic to physical distance is, on average, larger than 0.5 cM per Mb (Fig. 4). As half of the physical map is contained in contigs larger than 904 kb (Table II), finding markers flanking a target locus on both sides and located on a single or neighboring physical contig seems within reach for large mapping populations. In this situation, the genome-wide physical map of barley would obviate the need for chromosome walking. Instead, BAC contigs harboring flanking or cosegregating markers can be identified either by library screening or searching the sequence resources integrated with the physical map. Subsequently, the MTP of physical contigs can be sequenced and the sequencing data mined for candidate genes or additional markers.
One shortcoming of the current sequence-enriched framework is the lack of resolution in pericentromeric regions. Both positional cloning and genetic anchoring of the physical map are hampered by the severely reduced recombination frequency in the genetic centromere. It is not uncommon, even in large mapping populations, that closely flanking markers of a target gene residing in the genetic centromere are located on opposite chromosome arms (Shahinnia et al., 2012; Okagaki et al., 2013). Several hundred megabases (encompassing several dozen BAC contigs) may correspond to a genetic interval of less than 1 cM, and the ordering of physical contigs with respect to each other is lacking for these regions.
The limited resolution of our map in centromeric regions may be improved through populations that provide higher mapping resolution (e.g. a large number [more than 1,000] of recombinant inbred lines). Genome-wide high-density genotyping of several hundred or even thousands of individuals has been made possible by cost-effective genotyping by sequencing (Elshire et al., 2011; Poland et al., 2012). Apart from recombination-based mapping, radiation hybrid panels similar to those implemented in wheat (Kalavacharla et al., 2006) or obtained for barley by the activity of gametocidal chromosomes (Masoudi-Nejad et al., 2005) may be of value. Similarly, fluorescent in situ hybridization mapping (Cheng et al., 2001) and optical mapping (Zhou et al., 2009) hold some promise and may be used to order markers in recombinogenically inert regions. More importantly, once the MTP has been sequenced, overlapping adjacent physical contigs can be merged using sequence information, resulting in an improved linear order within the same genetic bin.
In summary, the genome-wide physical map of barley constitutes the backbone for map-based sequencing of the genome and facilitates high-resolution trait mapping and gene isolation. We expanded the POPSEQ method to anchoring of physical contigs and BAC clones. This can be adopted by current and future physical mapping projects of other large and complex genomes.

HICF and Automatic Contig Assembly
Six BAC libraries were used in this study (Yu et al., 2000; Schulte et al., 2011). These had been constructed either by partial enzymatic fragmentation of high-Mr DNA with the restriction endonucleases HindIII, EcoRI, or MboI or after mechanical fragmentation and blunt-end ligation. Plasmid DNA was isolated from a total of 690,912 BAC clones as described previously and was subjected to HICF according to published procedures (Luo et al., 2003). Peak areas, peak heights, and fragment sizes of each BAC fingerprint profile were collected by the Applied Biosystems 3730xl data collection program. The sizing quality of raw fingerprinting data was assessed with GeneMapper version 4.0 software (Applied Biosystems), and fingerprint profiles were further edited with the tool FPminer as described previously. Clones were tagged as neighbor or plate-wide contaminated and excluded from further analysis if the overall fragment identity of two clones at neighboring positions within one plate or at an identical position in subsequent plates of the library was higher than 50%. Chloroplast DNA contamination was determined by comparing the fingerprint profiles with a BAC clone containing the chloroplast genome (Saski et al., 2007) of barley (Hordeum vulgare) cv Morex. Clones sharing more than 50% of their fragments with this BAC were excluded from further analysis. Furthermore, all clones with fewer than 30 or more than 250 fragments were discarded in order to improve the contig assembly. Fingerprinting profiles were assembled into contigs using the fingerprint assembly software tool FPC version 9.2 (Soderlund et al., 2000). The initial contig build was performed with a Sulston score threshold of 1e-90 and a tolerance of 5, and the questionable clones (Q clones) parameter was set to 10% (i.e. 10% of clones in a contig were tolerated to be questionable; Q clones are clones for which the Contig Built algorithm of FPC cannot order at least 50% of the bands in the Contig Built map).
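The clone-level quality filters described above can be sketched as follows. This is a minimal illustration rather than the FPminer implementation: the function names, the representation of a fingerprint as a list of fragment sizes, the size-matching tolerance, and the definition of fragment identity as the fraction of a clone's fragments that have a match in the other profile are all assumptions made for the example.

```python
from bisect import bisect_left

MIN_FRAGMENTS = 30          # lower fragment-count limit from the text
MAX_FRAGMENTS = 250         # upper fragment-count limit from the text
MAX_SHARED_FRACTION = 0.5   # contamination threshold from the text (50%)


def shared_fraction(query, reference, tolerance=0.5):
    """Fraction of query fragment sizes that have a match in reference.

    Two fragments match if their sizes differ by at most `tolerance`
    (a hypothetical value). Matching is not one-to-one; this is a
    simplification.
    """
    if not query:
        return 0.0
    ref = sorted(reference)
    matched = 0
    for size in query:
        i = bisect_left(ref, size - tolerance)
        if i < len(ref) and ref[i] <= size + tolerance:
            matched += 1
    return matched / len(query)


def passes_filters(fragments, neighbor_profiles=(), chloroplast_profile=()):
    """Return True if a clone survives the three filters described above."""
    # Filter 1: fragment count must lie within the accepted range.
    if not MIN_FRAGMENTS <= len(fragments) <= MAX_FRAGMENTS:
        return False
    # Filter 2: neighbor or plate-wide contamination.
    for other in neighbor_profiles:
        if shared_fraction(fragments, other) > MAX_SHARED_FRACTION:
            return False
    # Filter 3: chloroplast contamination.
    if chloroplast_profile and (
        shared_fraction(fragments, chloroplast_profile) > MAX_SHARED_FRACTION
    ):
        return False
    return True
```

A clone would be passed together with the profiles of its plate neighbors and of the clones at the same well position in adjacent plates; how those neighbors are looked up depends on the library layout and is left out here.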
To incorporate further singleton clones into contigs, "single-to-end" and "end-to-end" merging (Match: 2; FromEnd: 55) was performed at nine successively reduced cutoffs (Sulston score threshold of 1e-85 to 1e-45) as described previously (Paux et al., 2008). Contigs with more than 10% Q clones (clones for which FPC cannot order more than half of the bands) were reanalyzed with the DQer function of FPC as described previously (Soderlund et al., 2000; Paux et al., 2008).

Manual Editing
Physical contigs were genetically positioned or assigned to a chromosome arm using experimental as well as bioinformatics procedures as described previously (The International Barley Genome Sequencing Consortium, 2012). All contigs with genetic marker information or chromosome arm assignment were examined for marker distribution along the entire length of the contig and consistent position assignments among the diverse anchoring information. Contigs that contained at least one contradicting marker were verified manually by rebuilding the contig with increased stringency (Sulston score threshold of 1e-70). Putative chimeric contigs with clusters of markers from different chromosomes were manually disrupted at the appropriate site for marker consistency, and contig integrity was established by rebuilding the contigs (Soderlund et al., 2000). Final contig merging was performed with a Sulston score cutoff of 1e-25. At this stage, two contigs were merged only if a minimum of two BAC clones at the end of each contig matched in a reciprocal and unique manner and shared marker information with a maximum genetic distance of 5 cM. Finally, all contigs composed of only two BAC clones were examined to identify those consisting of equivalent clones from the library HVVMRXALLhA (Yu et al., 2000) and its rearrayed subset of gene-containing clones (Madishetty et al., 2007; named HVVMRX83KhA after rearraying). By comparing the average fragment number per clone in the final map with the average insert size determined by pulsed field gel electrophoresis or BAC sequencing (The International Barley Genome Sequencing Consortium, 2012), we determined the average size of an FPC consensus band to be 1.24 kb. Using this conversion factor, the cumulative length of all physical contigs was calculated.
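The band-to-length conversion can be illustrated with a short calculation. The sketch below uses only figures given in the text (1.24 kb per consensus band, approximately 4.0 million consensus bands, a 5.1-Gb genome); the function names are hypothetical. Because the band count is a rounded value, the result only approximates the reported 4.9 Gb and 96%.

```python
BAND_SIZE_KB = 1.24      # average FPC consensus band size given in the text
GENOME_SIZE_GB = 5.1     # barley genome size given in the text


def bands_to_gb(n_bands, band_size_kb=BAND_SIZE_KB):
    """Convert a number of consensus bands to a physical length in Gb."""
    return n_bands * band_size_kb / 1e6  # 1 Gb = 1e6 kb


def genome_fraction(length_gb, genome_size_gb=GENOME_SIZE_GB):
    """Fraction of the genome represented by a given physical length."""
    return length_gb / genome_size_gb


# Rounded band count from the text; the exact count is not given here.
map_length_gb = bands_to_gb(4.0e6)
print(f"{map_length_gb:.2f} Gb, "
      f"{genome_fraction(map_length_gb):.1%} of the genome")
```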

Contig Validation and MTP Establishment
BAC contigs generated by FPC were further analyzed with the alternative physical mapping tool LTC (Frenkel et al., 2010) to trace misassembled contigs caused by chimerical clones and false clone overlaps. Misassemblies were detected by analysis of the topological structure of the net of clone overlaps within contigs (using a Sulston score cutoff of 1e-20): a contig was considered as putatively problematic if this net had a nonlinear structure. To check the linearity of the net, LTC automatically selected one of the possible diametric paths (the longest nonreducible path going through edges corresponding to significantly overlapping clones, with vertices corresponding to clones) and scored the ranks of all vertices relative to the vertices from this diametric path (vertices from the diametric path were ranked as 0, vertices connected to vertices from the diametric path by a single edge were ranked as 1, etc.). The presence of vertices having a rank of 2 or more indicated the nonlinearity of the net (Frenkel et al., 2010). In the case of nonlinearity of the net structure, LTC automatically tries to repeat the analysis using a more stringent cutoff (increasing the cutoff stringency by 1 order of magnitude at each step). In many cases, this increasing stringency resulted in the exclusion of false overlaps, and the net of remaining clone overlaps was found to have a linear structure. To identify the presence of putative chimerical clones (composed by artificial fusion of DNA from distant genomic regions), the parallel clone overlaps were automatically searched for each clone of the contig (Frenkel et al., 2010). LTC was also used for the recalculation of clone end coordinates within FPC-based contigs (algorithms implemented in LTC enable reducing errors in the estimation of clone position and contig length caused by local optimization in FPC). The MTP was automatically selected by LTC.
Following the application of the LTC algorithms, each contig in the selected MTP was based on clones corresponding to vertices from the diametric path used for contig testing (see above). In this manner, the MTP selection resulted in the identification of significant overlaps between neighboring MTP clones. For the selected cutoff stringency, each pair of neighboring MTP clones has an overlap of about 35% of the clone length. The MTP clones were rearrayed into microtiter plates and grouped by chromosomes. Three-dimensional BAC DNA pools were established and validated for all chromosomes. The MTP clones and the three-dimensional BAC pools can be ordered from CNRGV (http://cnrgv.toulouse.inra.fr/en/library/barley).
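The rank-based linearity test can be sketched on a small overlap graph. This is an illustration under stated assumptions, not the LTC implementation: the overlap net is given as an adjacency dictionary of one connected contig, and the diametric path is approximated by two breadth-first sweeps (which is exact on tree-like nets but only a heuristic in general). All function names are hypothetical.

```python
from collections import deque


def farthest_path(graph, start):
    """Shortest path from start to one of the vertices farthest from it."""
    parent = {start: None}
    queue = deque([start])
    last = start
    while queue:
        v = queue.popleft()
        last = v  # the last vertex dequeued by BFS is at maximal distance
        for w in graph[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    path = []
    while last is not None:
        path.append(last)
        last = parent[last]
    return path[::-1]


def diametric_path(graph):
    """Approximate a diametric path with two BFS sweeps (assumption)."""
    start = next(iter(graph))
    end = farthest_path(graph, start)[-1]
    return farthest_path(graph, end)


def vertex_ranks(graph, path):
    """Rank 0 for path vertices, 1 for their direct neighbors, and so on."""
    rank = {v: 0 for v in path}
    queue = deque(path)
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in rank:
                rank[w] = rank[v] + 1
                queue.append(w)
    return rank


def is_linear(graph):
    """A net is treated as linear if no vertex has a rank of 2 or more."""
    ranks = vertex_ranks(graph, diametric_path(graph))
    return all(r < 2 for r in ranks.values())
```

For a nonlinear net, the same test would be repeated after removing edges that fail a more stringent overlap cutoff, mirroring the stepwise increase in stringency described above.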

Distribution of BAC Libraries
The expected number of clones from a library L per sliding window B on a chromosome C was calculated using the formula e = n(L,C) × (l(B)/l(L,C)), where l(L,C) is the combined length of all clones of library L on chromosome C, l(B) is the combined length of clones anchored to B, and n(L,C) is the number of all clones of library L that are contained in contigs anchored to chromosome C. The length of clones was derived from the number of FPC consensus bands. The normalized count of clones from a fixed library per genetic bin was calculated by dividing the observed number of clones by the expected number of clones. The size of a sliding window was chosen to include 30 adjacent positions in the genetic map.
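The normalization can be written out directly from the formula above. The sketch below is illustrative only: the function and argument names are hypothetical, the lengths may be in any consistent unit (for example, consensus bands), and the input numbers are invented for the example.

```python
def expected_clones(n_lib_chr, len_window, len_lib_chr):
    """e = n(L,C) * (l(B) / l(L,C)), following the formula in the text.

    n_lib_chr   -- n(L,C): clones of library L in contigs anchored to C
    len_window  -- l(B): combined length of clones anchored to window B
    len_lib_chr -- l(L,C): combined length of clones of library L on C
    """
    return n_lib_chr * (len_window / len_lib_chr)


def normalized_count(observed, n_lib_chr, len_window, len_lib_chr):
    """Observed clone count divided by the expected clone count."""
    return observed / expected_clones(n_lib_chr, len_window, len_lib_chr)


# Invented example values: 1,000 library clones on the chromosome,
# a window holding 500 of 4,000 length units, and 250 observed clones.
e = expected_clones(1000, 500, 4000)
ratio = normalized_count(250, 1000, 500, 4000)
print(e, ratio)
```

A ratio above 1 indicates that the library is overrepresented in the window relative to its chromosome-wide share, as described for the gene-enriched library in distal regions.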

POPSEQ Anchoring of Physical Contigs and BAC Clones
We had previously anchored the whole-genome shotgun assembly of barley cv Morex (Mascher et al., 2013). Sequences of anchored WGS contigs were compared with fully sequenced BACs and BAC end sequences of barley (The International Barley Genome Sequencing Consortium, 2012) by megablast (Zhang et al., 2000) and thus assigned to physical contigs. Similar to aggregating anchoring information of single SNPs on WGS contigs (Mascher et al., 2013), the genetic position of a physical contig or of a single clone was defined as the median genetic position of all WGS contigs assigned to it. We required that at least 80% of all WGS contigs assigned to an FP contig or BAC clone lie on the same chromosome and the median absolute deviation of their cM positions be less than 5 cM. Genetic positions in the OWB map were interpolated to the Morex × Barke iSelect map as described previously (Mascher et al., 2013). When combining coordinates, we required consistency between both maps (i.e. both genetic positions had to be no farther apart from each other than 5 cM). If contigs or clones were anchored to both framework maps, the Morex × Barke position was preferred.
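The anchoring rules in this section can be sketched as two small functions. This is a minimal illustration, not the authors' pipeline: the function names and the input format (a list of chromosome and cM pairs for the WGS contigs assigned to one FP contig or BAC clone) are hypothetical, and computing the median over the WGS contigs on the majority chromosome only is an assumption. OWB positions are assumed to have been interpolated to the Morex × Barke map beforehand.

```python
from collections import Counter
from statistics import median


def anchor(hits, min_chr_fraction=0.8, max_mad_cm=5.0):
    """Anchor one FP contig or BAC clone from its assigned WGS contigs.

    hits -- list of (chromosome, cM) tuples, one per anchored WGS contig.
    Returns (chromosome, median cM) or None if the criteria are not met.
    """
    if not hits:
        return None
    chrom, count = Counter(c for c, _ in hits).most_common(1)[0]
    # At least 80% of WGS contigs must lie on the same chromosome.
    if count / len(hits) < min_chr_fraction:
        return None
    # Assumption: positions on the majority chromosome define the median.
    positions = [cm for c, cm in hits if c == chrom]
    center = median(positions)
    mad = median(abs(p - center) for p in positions)
    # The median absolute deviation must be less than 5 cM.
    if mad >= max_mad_cm:
        return None
    return chrom, center


def combine(mxb, owb, max_diff_cm=5.0):
    """Merge anchors from both maps, preferring Morex x Barke."""
    if mxb is None:
        return owb
    if owb is None:
        return mxb
    # Both maps must agree on chromosome and lie within 5 cM.
    if mxb[0] != owb[0] or abs(mxb[1] - owb[1]) > max_diff_cm:
        return None
    return mxb
```

Using the median and the median absolute deviation keeps a single misassigned WGS contig from shifting or invalidating the position of an otherwise consistently anchored contig.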

Supplemental Data
The following materials are available in the online version of this article.