Genetic analysis of strawberry fruit aroma and identification of O-methyltransferase FaOMT as the locus controlling natural variation in mesifurane content

Improvement of strawberry (Fragaria × ananassa) fruit flavor is an important goal in breeding programs. To investigate genetic factors controlling this complex trait, a strawberry mapping population derived from genotype ‘1392’, selected for its superior flavor, and ‘232’ was profiled for volatile compounds over 4 years by HS-SPME-GC-MS. More than 300 volatile compounds were detected, of which 87 were identified by comparison of mass spectrum and retention time to those of pure standards. Parental line ‘1392’ displayed higher volatile levels than ‘232’, and these and many other compounds with similar levels in both parents segregated in the progeny. Cluster analysis grouped the volatiles into distinct chemically related families and revealed a complex metabolic network underlying volatile production in strawberry fruit. QTL detection was carried out over 3 years based on a double pseudo-testcross strategy. Seventy QTLs covering 48 different volatiles were detected, with several of them being stable over time and mapped as major QTLs. Loci controlling γ-decalactone and mesifurane content were mapped as qualitative traits. Using a candidate gene approach, we have assigned genes that are likely responsible for several of the QTLs. As a proof of concept, we show that one homoeolog of the O-methyltransferase gene (FaOMT) is the locus responsible for the natural variation of mesifurane content. Sequence analysis identified a 30-bp indel in the promoter of this FaOMT homoeolog containing putative binding sites for bHLH, MYB and bZIP transcription factors. This polymorphism fully co-segregates with both the presence of mesifurane and the high expression of FaOMT during ripening.


INTRODUCTION
Fruit flavor is a key characteristic for consumer acceptability, and it is therefore not surprising that its improvement is receiving increasing attention in strawberry breeding programs. Aroma compounds are key contributors to fruit flavor perception, which relies on a combination of taste, smell, appearance and texture (Taylor and Hort, 2004). Thus, the volatile composition of strawberry fruits has been extensively studied and more than 360 constituents have been reported, including esters, aldehydes, ketones, alcohols, terpenes, furanones and sulfur compounds (Latrasse, 1991; Zabetakis and Holden, 1997; Ménager et al., 2004; Jetti et al., 2007). Strawberry aroma increases rapidly as fruit ripens and differs between species and between cultivars owing to differences in the quantities and/or combinations of the compounds in this complex mixture (Ulrich et al., 2007). In addition, growing practices, seasonal variations and storage conditions also affect the fruit volatile profile (Zabetakis and Holden, 1997; Forney et al., 2000).
Analysis of the aroma value (the ratio of compound concentration to odor threshold) has indicated that fewer than 20 compounds contribute significantly to strawberry flavor (Schieberle and Hofmann, 1997; Ulrich et al., 1997; Jetti et al., 2007). Esters, formed by esterification of alcohols and acyl-CoA, constitute the largest and one of the most important groups contributing to the aroma of strawberry fruit (Pérez et al., 1992; Pérez et al., 2002).
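The aroma value defined above is a simple ratio, and a minimal sketch may help make the definition concrete. The compound names, concentrations and odor thresholds below are hypothetical placeholders chosen for illustration; they are not values from this study or from the cited works.

```python
def aroma_value(concentration, odor_threshold):
    """Ratio of compound concentration to odor threshold (same units for both)."""
    if odor_threshold <= 0:
        raise ValueError("odor threshold must be positive")
    return concentration / odor_threshold


# Hypothetical (concentration, odor threshold) pairs, for illustration only.
compounds = {
    "compound_A": (500.0, 5.0),
    "compound_B": (20.0, 100.0),
}

aroma_values = {name: aroma_value(c, t) for name, (c, t) in compounds.items()}

# A compound with an aroma value above 1 is present above its odor threshold.
contributors = [name for name, av in aroma_values.items() if av > 1]
```

Under this definition, an abundant compound with a high odor threshold can matter less to perceived aroma than a trace compound with a very low threshold.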
Major progress has been achieved in defining the pathways for volatile biosynthesis in plants and this has resulted in the identification of many genes encoding biosynthetic enzymes (Yamashita et al., 1976;Schwab et al., 2008;Klee, 2010;Osorio et al., 2010;Pérez and Sanz, 2010). A number of these loci have been used in transgenic approaches to engineer volatile production (Beekwilder, 2004;Lunkenbein et al., 2006a;2006b). However, the desired effects have not always been obtained, as regulation of metabolic flux or substrate availability might be important parameters influencing their biosynthetic pathway.
The main advantage of QTL analysis compared to the former approaches is that it allows the identification of loci that, by definition, alter the target trait in natural (and mapping) populations. In addition, QTL analyses can increase our knowledge of the molecular mechanisms by which aroma compounds are regulated in fruits, which remain largely unknown (Aharoni, 2004; Klee, 2010).
High-throughput assays are expensive and technically challenging and, furthermore, volatiles are environmentally influenced to a large extent. Despite these challenges, a number of studies have addressed the analysis of QTLs affecting fruit volatiles. Tomato is by far the most studied fruit and a number of QTLs have been identified in intraspecific crosses (Causse et al., 2001; Saliba-Colombani et al., 2001; Zanor et al., 2009) and introgression lines (ILs) generated from crosses with wild relatives such as S. pennellii (Schauer et al., 2006; Tieman et al., 2006) and S. habrochaites (Mathieu et al., 2009). Other studies have investigated the genetic basis of aroma compounds by a QTL-based approach in apple (Zini et al., 2005; Dunemann et al., 2009), grape (Doligez et al., 2006; Battilana et al., 2009), rose (Spiller et al., 2010; Spiller et al., 2011) or Eucalyptus (O'Reilly-Wapstra et al., 2011). In strawberry, the inheritance patterns of key aroma compounds were examined in segregating populations of F. × ananassa and F. virginiana, showing quantitative inheritance typical of polygenic traits (Carrasco et al., 2005; Olbricht et al., 2008). However, to our knowledge, no loci controlling strawberry volatile compounds have been mapped to date, probably owing to the octoploid constitution of the species and its susceptibility to inbreeding depression.
virginiana (Darrow, 1966). A number of cytological genome models have been proposed for the octoploid species, but the most widely accepted to date is that of Bringhurst (1990), who proposed the genomic conformation octoploid Fragaria genomes and disomic inheritance. Up to four diploid ancestors have contributed to the genomes of octoploid strawberries, with an ancestor of F. vesca (2n=2x=14) being the maternal donor of the A sub-genomes (Rousseau-Gueutin et al., 2009). A high colinearity between the genomes of F. × ananassa and F. vesca has been reported (Rousseau-Gueutin et al., 2008;Sargent et al., 2009;Zorrilla-Fontanesi et al., 2011b). This, together with the availability of a comprehensive genome sequence and annotated gene predictions for the diploid species (Shulaev et al., 2011) will greatly facilitate genetic investigations in the cultivated strawberry.
Mapping of QTLs controlling fruit aroma and volatile levels and subsequent identification of linked molecular markers is an important goal for future marker-assisted selection (MAS) in strawberry. To accelerate the process of QTL identification, and gain insight into the biological mechanism, the candidate gene (CG) approach can be used to identify genes governing the amount of volatile compounds, including those contributing to aroma (Pflieger et al., 2001). The aim of this study was to use the linkage maps of cultivated strawberry to locate selected CGs involved in aroma biosynthesis and to identify genomic regions controlling volatile compounds through QTL detection. Since strawberry is a highly heterozygous species, we used an F1 population and a pseudo-testcross strategy to create separate parental maps (Grattapaglia and Sederoff, 1994). The parental lines of the mapping population, '232' and '1392', differed, among other traits (Zorrilla-Fontanesi et al., 2011b), in the overall fruit flavor scores recorded during the breeding program for these two selections. In the present study, the parental and the 95 progeny lines were phenotypically evaluated for the content of individual volatile compounds in fruit purees using headspace solid phase micro-extraction coupled to gas chromatography and mass spectrometry (HS-SPME-GC-MS) over 4 successive years. A large number of major and stable QTLs were detected, and co-segregation with CGs that potentially play a role in the variation of volatile compounds was also found. One of these associations was studied in more detail, and expression studies as well as promoter sequence analysis in contrasting lines resulted in the identification of FaOMT as the gene responsible for the variation in mesifurane content, a key compound for strawberry flavor. Overall, this study gives important clues for understanding the genetic basis of aroma/flavor regulation in strawberry fruit.

Volatile profiling and analysis of the variation in the '232' × '1392' mapping population
Automated headspace solid phase micro-extraction (HS-SPME) sampling coupled to gas chromatographic separation produced chromatograms with more than 300 distinct peaks for each of the 4 assessed years. Among them, 87 volatiles including the majority of those previously shown to contribute to the aroma of strawberry could be identified using gas chromatography and mass spectrometry (GC-MS). Two compounds that have been reported as key for strawberry flavor, ethyl 2-methylbutanoate and furaneol, were not detected. For furaneol, this was due to its water-soluble nature and thermal instability (Pérez et al., 1996).
The relative content of these 87 volatiles in fruits of the parents and F1 progeny over three of the four years (2007-2009), along with their corresponding ID codes and descriptive statistics, is shown in Table I. The parental lines '232' and '1392' displayed similar relative content for several volatile compounds, such as the majority of alcohols and esters, but line '1392' (selected for good flavor) displayed higher relative concentration of aldehydes, ketones, furans and terpenes. These differences were significant in the three years for 15 compounds (Table I). The segregating progeny displayed even higher variation for most of the volatiles, with the exception of compounds 44, 53, 83 and 86, which showed little variation or very low content in the majority of individuals and therefore were not used for QTL analysis.
Hierarchical cluster analysis (HCA) using the results from the 4 seasons of volatile profiles of parental lines as well as those of the F 1 progeny was used to further investigate the relationship between both compounds and individuals of the population (Fig. 2). This analysis grouped the volatiles into three distinct clusters (A-C), each of them containing biosynthetically related aroma compounds. This result validates our analysis and further reveals the complex metabolic network underlying volatile production in strawberry fruit.
Cluster A grouped ~36% of the identified volatiles and was enriched in esters, which are quantitatively the main contributors to the aroma of strawberry fruit. Cluster A also included 3 alcohols: 1-decanol (70), 1-octanol (50) and eugenol (74) (38) or (E)-2-hexenyl acetate (41), 1-hexanol (18) and 4 esters of butanoic acid (Fig. 2). Clustering of esters and alcohols is an expected result, since esters are enzymatically synthesized by coupling the respective acids and alcohols (Yamashita et al., 1977). Volatile profiles displayed considerable variation between different years (Table I), suggesting an important influence of environmental factors. However, in the hierarchical cluster analysis, samples from the same line in the four different years were in general closely associated, indicating that, despite a clear environmental influence, the genotypic variation may be sufficient for QTL detection (Fig. 2 and Fig. S2).

Cluster validation using correlation analyses
For a deeper understanding of the production of volatile compounds in strawberry, a correlation-based approach was adopted, which has been shown to be a useful tool to gain insight into metabolic pathways and networks (Raamsdonk et al., 2001; Weckwerth et al., 2004). To identify co-regulated compounds in the population, the pair-wise correlation for each volatile was analyzed against every other volatile. Pearson correlation coefficients were calculated for year 2008 (Supplemental Table S1) and the corresponding heat map representation and HCA are shown in Fig. S3. Of the 7,569 possible pairs analyzed, 2,558 resulted in significant correlations (P < 0.05). Of these, most (2,176) showed positive correlation coefficients and only 382 showed negative ones. The strongest negative correlations were found between 2-methylbutyl acetate (20) and three other volatiles: hexyl hexanoate (75; r = -0.52), ethyl butanoate (10; r = -0.39) and nerolidol (84; r = -0.38). Negative correlations were also found between alcohols and esters, for example between 1-hexanol (18) and butyl hexanoate (60; r = -0.30) or octyl hexanoate (85; r = -0.29). By contrast, high positive correlations were found between the alcohols (E)-2-hexen-1-ol (17) and 1-hexanol (18; r = 0.86), between the esters (Z)-3-hexenyl acetate (38) and (E)-2-hexenyl acetate (41; r = 0.78), between the aldehydes (E)-2-heptenal (29) and (E)-2-octenal (49; r = 0.87) or nonanal (55) and decanal (64; r = 0.81), and between a group of 12 esters, whose correlation coefficients ranged between 0.65 and 0.85. As previously reported in tomato (Zanor et al., 2009), a high positive correlation was found between the terpenes linalool (54) and terpineol (65; r = 0.77).
Since all these strong pair-wise correlations involve volatiles that share a common structure and belong to the same family, a likely explanation is that they are in the same biochemical (biosynthetic) pathway and/or display mutual control by a single enzyme. However, it should be noted that non-neighboring metabolites may be highly correlated and, by contrast, metabolites that participate in common reactions may not always exhibit significant correlation (Steuer et al., 2003;Camacho et al., 2005).
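The pair-wise correlation analysis described in this section can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to produce Supplemental Table S1: the relative contents are invented, the volatile IDs are reused only as labels, and the P value is obtained from the Fisher z approximation because the text does not state how significance was computed. Note that an 87 × 87 correlation matrix has 87 × 87 = 7,569 cells, which matches the number of pairs quoted above.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_p(r, n):
    """Approximate two-sided P value via the Fisher z transform (assumes n > 3)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def correlation_table(profiles, alpha=0.05):
    """Return {(id_a, id_b): (r, p, significant)} for every unordered pair."""
    ids = sorted(profiles)
    table = {}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            r = pearson_r(profiles[a], profiles[b])
            p = fisher_p(r, len(profiles[a]))
            table[(a, b)] = (r, p, p < alpha)
    return table


# Invented relative contents for three volatiles across six hypothetical lines.
profiles = {
    17: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    18: [1.1, 2.0, 3.2, 3.9, 5.1, 6.0],
    20: [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
}
table = correlation_table(profiles)
```

The resulting table can then be split into positive and negative significant pairs, or passed to a clustering routine to obtain a heat map ordering of the kind shown in Fig. S3.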

HCA based on pair-wise correlations distinguished four major clusters of biosynthetically related compounds, which were highly similar to those found in the HCA of lines and metabolites. Some of them could be divided into minor sub-clusters containing volatiles with remarkable associations from a metabolic point of view (Fig. S3). The first cluster (A) contained mainly esters of acetic acid and alcohols, such as (E)-2-hexen-1-ol (17) or 1-hexanol (18), and comprised ~16% of the identified volatiles (Fig. S3). The second large group (cluster B) contained ~24% of the identified compounds and essentially included aldehydes, such as (E)-2-heptenal (29), (E)-2-octenal (49), nonanal (55), decanal (64) or (E)-2-decenal (69; Fig. S3). Cluster C comprised about 26% of the volatiles, with a sub-cluster including the highly and positively correlated terpenes linalool (54) and terpineol (65; Fig. S3). Finally, the other sub-cluster of C and cluster D contained ketones, furans and the majority of esters (Fig. S3). In theory, compounds derived from the same precursor or synthesized by the same enzyme should cluster together and, therefore, QTLs controlling the variation of one of them may also control the variation of the others.

Genetic mapping of QTLs controlling aroma compounds in strawberry fruits
For the QTL analyses, additional markers, including a number derived from aroma candidate genes (Table II) (Table III).
The '1392' map consisted of 227 markers distributed in 36 LGs with a cumulative length of 869 cM (Table III). The integrated map included 363 markers distributed in 39 LGs and spanned a cumulative length of 1,400 cM (Table III). In agreement with their common enzymatic reaction, QTLs for different esters frequently co-located with QTLs controlling alcohols. They clustered in HG I, HG III, HG V and HG VI (Fig. 3). As an example, a QTL for 1-octanol in HG VI (50VI-1) stable in all three years co-located with QTLs for octyl acetate (63VI-1), octyl butanoate (76VI-1) and octyl hexanoate (85VI-1), also stable in all three years. These volatiles grouped together in the HCA and were significantly correlated (Supplemental Table S1).
Between 1 and 3 QTLs were identified per volatile trait, with the percentage of phenotypic variation (R²) explained by each QTL ranging from 14.2 to 92.8% (Supplemental Tables S2-S4). One major QTL was detected for 31 volatiles and between 2 and 3 QTLs were detected for the remaining 17 compounds. This high proportion of major QTLs suggests that variation in strawberry fruit aroma is regulated by a limited set of loci with a high effect rather than multiple loci with reduced effects, in contrast to the regulation of other agronomic and fruit quality traits (Zorrilla-Fontanesi et al., 2011b). Mesifurane (48) and γ-decalactone (82) displayed qualitative variation suggestive of single locus inheritance.
Mesifurane was detected in fruits of both parental lines and about 25% of the progeny presented concentrations of this compound close to the detection limit in the 4 seasons (expected 3:1 ratio; p=0.36). In contrast to mesifurane, γ-decalactone was present in fruits of only one parental line, '1392', and the observed segregation pattern in the progeny matched the expected 1:1 ratio (p=0.76). The two traits were scored as presence or absence in the segregating population and mapped accordingly. Mesifurane and γ-decalactone were also mapped as QTLs and shown to co-localize to LG VII-F/M.2 and LG III-M.2, respectively (Fig. 3; Fig. S1). For γ-decalactone, the QTL explained about 90% of the total variance, corroborating our single gene hypothesis (Supplemental Table S4). The detected QTL for mesifurane explained from 42% up to 67.3% of the phenotypic variance, indicating a strong effect of this locus in the control of total variation (Supplemental Table S4).
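Testing a presence/absence trait against an expected Mendelian ratio is commonly done with a chi-square goodness-of-fit test, and the sketch below shows one way to do it. The progeny counts are hypothetical (the text reports only the P values, not the class counts), so the output is not meant to reproduce p=0.36 or p=0.76. With two phenotypic classes there is one degree of freedom, for which the upper-tail probability can be written with the complementary error function.

```python
import math


def chi_square_gof(observed, ratio):
    """Chi-square statistic and expected counts for observed classes vs a ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts for 95 progeny lines: (trait present, trait absent/trace).
stat_3_1, expected_3_1 = chi_square_gof([75, 20], ratio=(3, 1))
p_3_1 = p_value_df1(stat_3_1)

stat_1_1, expected_1_1 = chi_square_gof([49, 46], ratio=(1, 1))
p_1_1 = p_value_df1(stat_1_1)

# A P value above 0.05 means the observed counts do not depart significantly
# from the tested ratio.
fits_3_1 = p_3_1 > 0.05
fits_1_1 = p_1_1 > 0.05
```

A 3:1 ratio is what would be expected when both parents are heterozygous for a dominant allele, and a 1:1 ratio when only one parent carries it in single dose, which is consistent with the presence of mesifurane in both parents and of γ-decalactone in only one.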

Association of candidate genes with aroma QTLs
With the exception of FaOMT (see below), none of the aroma candidate genes mapped in this study (Table II) located within the QTL intervals or, if they co-located, they were not functionally related to the corresponding volatiles. However, most of the markers used in the elaboration of these maps were derived from ESTs obtained from strawberry fruit. This opens the possibility of identifying candidate genes that account for the QTLs.
Among those markers, and hence genes that could be responsible for the variation in related compounds, we found marker ChFaM149, located in the ripening up-regulated cinnamyl alcohol dehydrogenase (CAD) gene (Blanco-Portales, 2002). This gene mapped to the LOD peak or within the confidence interval of QTLs for methyl hexanoate, pentanoate, benzoate and benzyl acetate (14I-1, 28I-1, 56I-1 and 59I-1; Supplemental Table S2 and Fig. 3). Two other ESTs within the confidence intervals of QTLs have homology to transcription factors with possible regulatory effects. One of these markers, ChFaM083 (in a putative zinc-binding transcription factor), mapped to LG I-2, inside the confidence interval of QTLs controlling esters (14I-2, 19I-2 and 73I-2), eugenol (74I-2) and terpenes (65I-2 and 66I-2), all of them stable in two or all three years (Supplemental Tables S2-S3 mapped to the bottom of LG VII-F.1, linked to marker ChFaM160. The QTL 48VII-2 controlling mesifurane content was mapped at approximately the same position in another LG of the same HG VII (Supplemental Table S4  4B) presumably being a non-functional allele that is not expressed or is expressed at very low levels. The fact that half of the lines in the population produced mesifurane and presented both bands indicates the dominance of the functional allele, consistent with being an expression QTL (eQTL).

Analysis of potential cis-regulatory elements in the FaOMT promoter sequence
Databases of known position-specific scoring matrices (PSSMs) were used to search for putative transcription factor binding motifs present in the FaOMT promoter sequences (detailed in Materials and Methods). We then focused on those putative binding motifs present in the promoter of the active alleles but missing in the promoters of the inactive alleles. The most relevant cis-regulatory elements and their position relative to the promoter sequence of the active 93-62 allele are listed in Table IV. The analysis detected a putative TATA-box at -131 bp from the ATG start codon and potential cis-regulatory elements associated with hormone, light and stress-related responses in all promoters. These cis-regulatory elements of the 4a and 4b promoters are depicted in Fig. 4C. The promoters were particularly enriched in light-responsive elements such as Sp1, G-box, I-box, and GT-1, suggesting that FaOMT could be tightly regulated by light (Terzaghi and Cashmore, 1995; Toledo-Ortiz et al., 2003). In addition, a number of hormone-responsive motifs were identified, with 3 motifs related to auxin regulation (one TGA-element and two AuxRR-core motifs) and one GARE-motif implicated in gibberellin responsiveness.
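The general idea of scanning promoters with a PSSM and keeping only motifs found in the active allele can be sketched as below. Everything in this example is an assumption made for illustration: the matrix is built from four invented binding sites rather than taken from a motif database, the two sequences are short made-up stand-ins for an active and an inactive promoter, and the score threshold is arbitrary. It does not reproduce the tools or matrices used for Table IV.

```python
import math

BASES = "ACGT"


def make_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (list of per-position dicts) from aligned sites."""
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pssm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Invented example sites resembling an ACGT-containing element.
toy_pssm = make_pssm(["CACGTG", "CACGTG", "CACATG", "CAGGTG"])

# Made-up sequences: the second lacks the six bases carrying the motif.
active_promoter = "TTGACACGTGTCAA"
inactive_promoter = "TTGATCAA"

active_hits = scan(active_promoter, toy_pssm, threshold=5.0)
inactive_hits = scan(inactive_promoter, toy_pssm, threshold=5.0)

# Motif instances present in the active sequence but absent from the inactive one.
allele_specific = {w for _, w, _ in active_hits} - {w for _, w, _ in inactive_hits}
```

In practice the threshold is usually set per matrix, and hits on both strands are considered; both refinements are omitted here to keep the sketch short.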
The potential motifs that were found in the functional allele but missing in the rest of the sequences concentrated in the 30-bp indel region (from -276 to -220 bp from the ATG) and included (1) an E-box/RRE motif, (2) a potential MYBL motif and (3) a sequence with high homology to an ABRE motif and to an ACGT-containing element (Table IV, Fig. 4C and Supplemental Fig. S4). Therefore, it is likely that these specific motifs are responsible for driving high FaOMT expression in strawberry fruit. The region around the 30-bp indel was not conserved between the FvOMT promoter and the functional FaOMT allele (Fig. S4), suggesting that this allele might not be expressed in F. vesca red ripe fruits. Therefore, we analyzed the expression of FvOMT in red fruits, leaves and roots of F. vesca and the commercial F. × ananassa cv. Camarosa (Fig. 5A). While only high expression of FaOMT was found in red fruits of 'Camarosa', low expression of FvOMT was found in all tissues of  To further investigate the function of FaOMT, we analyzed separately the expression in the receptacle and the achene at different stages of fruit ripening in the strawberry cultivar 'Camarosa' (Fig. 5B). This analysis showed different expression patterns in each organ.
FaOMT expression increased during ripening in the receptacle tissue while in the achene, the highest expression was observed in green fruit, decreasing later during ripening. This expression pattern is consistent with the role of FaOMT in the biosynthesis of mesifurane in the receptacle but also with the additional role that has been proposed in lignin biosynthesis (Lunkenbein et al., 2006a), which might be more relevant in the achene.

Variation in volatile compounds in '232' × '1392'
The relative content of 87 volatiles identified by gas chromatography and mass spectrometry was analyzed in the parental lines and F 1 progeny over four successive years.
In agreement with their quantitative contribution to the aroma of strawberry fruit, the majority of the volatiles were esters (49.4%), aldehydes and alcohols (27.6%), followed by several ketones, terpenes and furans (23.0%). All these compounds are known to occur in strawberry fruit and many shown to contribute to its aroma (Pérez and Sanz, 2010).
Differences in their relative concentration in each line in the four assessed years are most probably due to environmental factors. Nevertheless, remarkable differences between genotypes exist, which is in accordance with previous studies performed in apple and strawberry, where volatile profiles were more dependent on genotype than on environmental conditions (Fellman et al., 2000;Forney et al., 2000). Most of the volatiles showed a distribution typical of polygenic inheritance and, generally, levels of compounds in the F 1 individuals showed transgressive behavior. HCA based on both volatile profiles and correlation data grouped aroma compounds in similar clusters. Thus, volatiles belonging to the same biochemical pathway were normally grouped in the same cluster, suggesting a coregulation of these metabolites.
Mapping volatile compound content as single Mendelian traits has also been used for the genetic dissection of scent metabolites in diploid roses, where nerol and neryl acetate were mapped as single traits in the rose genome and geranyl acetate was mapped as an oligogenic trait controlled by two independent loci (Spiller et al., 2010). In agreement with our data, qualitative differences in mesifurane content have been reported in an analysis of five strawberry cultivars. Similarly, γ-decalactone has been described as an important cultivar-specific volatile in strawberry (Schieberle and Hofmann, 1997; Ulrich et al., 1997) and was detected in 44% of the progeny generated by crossing two strawberry cultivars that strongly differed in flavor (Olbricht et al., 2008). In accordance, γ-decalactone was only detected in one parental line ('1392') and in approximately half of the progeny. Therefore, markers linked to the locus controlling γ-decalactone in LG III-2 might be useful tools for future MAS, since it explained up to ~93.3% of the phenotypic variation and this compound confers a pleasant peach-like flavor note to strawberry fruit.

QTLs controlling the aroma of strawberry and associated candidate genes
Two different QTL detection methods, the non-parametric K-W test and IM, were employed to map a large number of loci controlling aroma compounds in the strawberry population '232' × '1392'. Although the first method is less powerful than IM, it allowed the detection of significant associations between marker genotypes and raw phenotypic data, confirming most of the QTLs detected using IM. QTLs for ~55% of the identified volatiles were detected in this study and 50% of them were stable in two or all three analyzed years.
Most of the QTLs (50.3%) controlled the production of esters, which are quantitatively the main contributors to the aroma of strawberry fruit.
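The single-marker Kruskal-Wallis (K-W) test mentioned above compares the ranks of a phenotype between marker genotype classes. The following sketch is a generic implementation under stated assumptions: the phenotype values are invented, the two groups stand for the marker-present and marker-absent classes expected for a single-dose marker, and the P value is computed for one degree of freedom only. It is not the software or settings used in this study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, pos = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[pos:pos + len(g)])
        h += rank_sum ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def p_value_df1(h):
    """Upper-tail chi-square probability for 1 df (two genotype classes)."""
    return math.erfc(math.sqrt(h / 2.0))


# Invented relative contents of one volatile, grouped by marker genotype.
marker_present = [5.1, 6.3, 7.2, 8.0, 6.8]
marker_absent = [1.2, 2.4, 3.1, 2.9, 1.8]

h_stat = kruskal_wallis([marker_present, marker_absent])
p_val = p_value_df1(h_stat)
```

Because the test works on ranks, it makes no assumption about the distribution of the raw phenotypic data, which is why it can be applied to untransformed volatile contents.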

Fragaria
LGVI (Gar et al., 2011;Iwata et al., 2012). Taken together, these results indicate that orthologous loci could control the content of different terpenes in these two closely related genera within the Rosaceae. If this is the case, the gene responsible would most probably be acting early in the terpene biosynthetic pathway or may display a broad substrate affinity.
In this study, several CGs involved in strawberry aroma biosynthesis were mapped and some found associated with QTLs for specific volatiles. Interestingly, a cluster of QTLs controlling 4 phenyl-derived esters, eugenol and terpineol co-located in LGs I-1 and I-2 with CAD homoeologs. The enzyme cinnamyl alcohol dehydrogenase (CAD) catalyzes the reversible conversion of cinnamyl aldehydes to the corresponding alcohols, the last step in the biosynthesis of monolignols (Singh et al., 2010). Strikingly, CAD has been reported to be expressed even in cells that do not make lignin (Singh et al., 2010). Although CAD expression pattern was related to vascular bundle formation in strawberry fruit (Aharoni, 2002), it has been suggested that cinnamyl alcohol derivatives produced by CAD activity may be implicated in fruit flavor and aroma (Mitchell and Jelenkovic, 1995;Singh et al., 2010). Thus, an association between CAD and phenylpropanoid derived volatile compounds could be explained by a higher CAD activity redirecting and increasing the substrates for production of phenylpropene compounds such as eugenol. However, it cannot be excluded that in addition to cinnamyl alcohol, CAD may have affinity for other substrates in strawberry receptacle, as has been shown for aromatic and terpene alcohols (Mitchell and Jelenkovic, 1995). This possibility deserves further investigation in order to determine whether CAD plays a role in the production of diverse volatile compounds.

The largest cluster of aroma QTLs, involving 16 different volatiles, was mapped to LG VI-1. Among them, 5 acetate esters co-located with QTLs for other related esters, such as butyl and octyl hexanoates and butanoates, and two alcohols, 1-octanol and 1-decanol.
The majority of these compounds were grouped in cluster D and were also highly correlated with each other (Fig. 2; Fig. S3). We could speculate that the locus underlying this common QTL might be controlling the content of a common precursor in their biosynthesis.
In agreement, substrate availability is a key limiting factor in the synthesis of volatiles and the role of substrates in their regulation is currently under study (Dudareva et al., 2004).
As in our results, the contents of several acetate esters are correlated in apple, and several of them have been mapped to MG9, which is syntenic to a region of FGVI (Dunemann et al., 2009; Rowan et al., 2009a; 2009b; Illa et al., 2011a; 2011b). Increasing the resolution of the '232' × '1392' maps and a search for candidate genes in the confidence interval of the QTL in the F. vesca genome are under way. Since the QTL could be conserved between strawberry and apple, the identification of common candidate genes in the intervals of these QTLs in the F. vesca and apple genomes would reduce the number of potential candidates and might hasten the identification of the underlying gene.

The gene FaOMT controls the variation in mesifurane content
We have shown that a QTL controlling mesifurane content (48VII-2) occurs at the same location as one of the 4 homoeologs of the FaOMT gene in LG VII-2. The substrate specificity of FaOMT indicates that it is involved in the methylation of furaneol to mesifurane (Wein et al., 2002). Furthermore, the inhibition of FaOMT in transgenic strawberry plants resulted in a near total loss of mesifurane (Lunkenbein et al., 2006a). These data, together with the dramatic reduction of FaOMT expression in lines with trace content of mesifurane, provide strong evidence that FaOMT is the locus controlling mesifurane content in strawberry fruit and is responsible for its natural variation.
The observed size and sequence variability in the examined OMT (FaOMT and

CONCLUSIONS
This study provides a genetic map of QTLs that represent a useful resource for the identification of the loci responsible for the variation of a number of volatiles in strawberry fruit. Some QTLs control the variation of volatiles that contribute significantly to the aroma/flavor of fruits but others may control volatile compounds relevant to plant survival, defense against pathogens or to plant-plant interaction. Some of them were mapped to well-defined regions and the availability of the genome sequence of F. vesca as well as the future sequencing of octoploid strawberry will allow the identification of many of the genes underlying these QTLs. Many of the QTLs identified explained a large proportion of the phenotypic variation and were stable over different years. Thus, associated molecular markers will represent useful tools for the selection of genotypes with enhanced concentrations of important aroma volatiles using molecular breeding approaches. QTLs identified in the genetic background of this strawberry population will gain from further studies involving other strawberry cultivars or even other Fragaria species to better understand the genetic architecture of strawberry aroma. Furthermore, since many volatile compounds are common to different important crops and ornamental species, QTLs identified in strawberry will facilitate advances in other species.
Using genetic, metabolomic and molecular approaches we have identified functional and inactive alleles of the gene FaOMT and shown that the expression of this gene is responsible for the natural variation in mesifurane content in strawberry fruit. Since the substrate of this enzyme, furaneol, is considered among the most important compounds influencing strawberry aroma, the selection of non-functional alleles of FaOMT using the marker here developed will be desirable in new elite cultivars. From a biotechnological perspective, the 30-bp indel might provide an important tool in order to engineer promoters able to drive high and specific expression in the receptacle during fruit ripening of selected genes.

Plant material
The F1 mapping population, comprising 95 progeny lines, was raised from the cross × FB used in this study has been described in Sargent et al. (2008).

Sample preparation
To analyze the aroma profiles of fruit purees of the octoploid strawberry population, 10-15 fully ripe fruits were harvested on the same day (at the middle of the season) from the parental lines and each of the 95 F1 lines in each of four successive years (2006, 2007, 2008 and 2009).
Fruits were immediately cut, frozen in liquid nitrogen and stored at -80°C. Later, fruits were powdered in liquid nitrogen using a coffee grinder and stored at -80°C until GC-MS analyses. Prior to the analysis of volatile compounds, frozen fruit powder (1 g fresh weight) of each sample was weighed in a 7 mL vial, closed, and incubated at 30°C for 5 min. Then 300 µL of a saturated NaCl solution was added, and 900 µL of the homogenized mixture was transferred to a 10 mL screw-cap headspace vial, from which the volatiles were immediately collected.

Automated Headspace Solid Phase Micro-Extraction, Gas Chromatography Separation and Mass Spectrometry Detection (HS-SPME-GC-MS)
The volatiles were sampled by headspace solid phase micro-extraction (HS-SPME; mode. Incubation of the vials, extraction and desorption of the volatiles were performed automatically by a CombiPAL autosampler (CTC Analytics). Chromatography was performed on a DB-5ms (60 m × 0.25 mm × 1 µm) column (J&W Scientific) with helium as carrier gas at a constant flow of 1.2 mL/min. GC interface and MS source temperatures were 260°C and 230°C, respectively. Oven temperature conditions were 40°C for 3 min, a 5°C/min ramp to 250°C, and then a hold at 250°C for 5 min. Mass spectra were recorded in scan mode in the 35 to 220 m/z range by a 5975B mass spectrometer (Agilent Technologies) at an ionization energy of 70 eV and a scanning speed of 7 scans/s. Chromatograms and spectra were recorded and processed using the Enhanced ChemStation software (Agilent Technologies).
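As a quick check of the oven program described above, the run time per injection can be worked out from the hold times and the ramp. The sketch below is illustrative only; the function name is hypothetical, and the scan count assumes (this is not stated in the text) that spectra were acquired over the whole run, so it is at best an upper bound.

```python
def oven_program_minutes(start_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total duration of a single-ramp GC oven program, in minutes."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the text: 40 °C for 3 min, 5 °C/min to 250 °C, 5 min hold.
total_min = oven_program_minutes(40, 3, 5, 250, 5)

# Assumption: acquisition at 7 scans/s for the entire run (upper bound).
max_scans = total_min * 60 * 7

print(f"run time: {total_min:.0f} min, at most {max_scans:.0f} scans")
```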

Compound Identification and relative quantification
Compounds were unequivocally identified by comparison of both mass spectrum and retention time to those of pure standards (Sigma-Aldrich), except 2-(1-pentenyl)furan, which was tentatively identified by comparison of its mass spectrum with those in the NIST05 library. Peak areas of selected specific ions were integrated for each compound.
These peak areas were then normalized to the peak area of the same compound in a reference sample injected regularly (a mixture of all the samples from the mapping population for each year) to correct for variations in detector sensitivity and fiber aging.
Data were expressed as the relative content of each metabolite compared to the reference sample.
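The normalization amounts to a simple ratio of peak areas. A minimal sketch follows; the line names and peak areas are invented for illustration and are not data from this study.

```python
def relative_content(sample_area, reference_area):
    """Peak area of a compound relative to the same compound in the reference sample."""
    if reference_area <= 0:
        raise ValueError("reference peak area must be positive")
    return sample_area / reference_area


# Hypothetical integrated peak areas (arbitrary units) for one compound.
reference_area = 2.0e6
sample_areas = {"line_A": 1.0e6, "line_B": 4.0e6, "line_C": 5.0e4}

relative = {line: relative_content(area, reference_area)
            for line, area in sample_areas.items()}

for line, value in relative.items():
    print(f"{line}: {value:.3f}")
```

A value of 1 means the same content as the reference sample; values below or above 1 indicate lower or higher content, respectively.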

Statistical Analysis
Descriptive statistical analysis of the identified compounds was performed using different modules of the STATISTICA 7.0 software package (StatSoft, Inc. 2007). Range of variation in the F1 progeny, skewness and kurtosis were calculated only for the last three years (2007-2009), and the Shapiro-Wilk test (Shapiro and Wilk, 1965) was applied to test normality of trait distributions. For those volatiles deviating from normality, several transformations (ln, log10, log2, inverse of square root, square root, square, cube, reciprocal and arcsine in degrees or radians) were tested, and the transformation that gave the least skewed result was used in the subsequent QTL analysis.
Two of the identified compounds (γ-decalactone and mesifurane) were resolved into single Mendelian traits and analyzed both as a single gene and as a quantitative trait. To analyze γ-decalactone as a major gene, genotypes with relative values higher or lower than 0.05 (i.e. one-twentieth of the content in the reference sample) were considered as producing or not producing γ-decalactone, respectively. For mesifurane, although lines considered not to produce mesifurane generally contained less than 1/20 of the content in the reference sample, the limit for scoring the lines as producing or not producing was set at 0.1 (i.e. one-tenth of the content in the reference sample).
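The choice of transformation can be illustrated with a short sketch that applies each candidate to a trait and keeps the one with the smallest absolute skewness. This is not the STATISTICA procedure used in the study: the data are invented, the skewness is the plain population estimator, the arcsine transformations are left out because they require values within [-1, 1], and the Shapiro-Wilk test is not reproduced.

```python
import math
import statistics


def skewness(values):
    """Population skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Candidate transformations for strictly positive relative contents.
TRANSFORMS = {
    "none": lambda v: v,
    "ln": math.log,
    "log10": math.log10,
    "log2": math.log2,
    "sqrt": math.sqrt,
    "inv_sqrt": lambda v: 1 / math.sqrt(v),
    "square": lambda v: v ** 2,
    "cube": lambda v: v ** 3,
    "reciprocal": lambda v: 1 / v,
}


def least_skewed(values):
    """Return the name of the least skewed transformation and all |skewness| scores."""
    scores = {name: abs(skewness([f(v) for v in values]))
              for name, f in TRANSFORMS.items()}
    return min(scores, key=scores.get), scores


# Hypothetical, strongly right-skewed relative contents.
data = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4]
best, scores = least_skewed(data)
print(best, round(scores["none"], 3))
```

For data spaced geometrically like this example, any of the logarithmic transformations removes the skew almost entirely, so which of the three is reported depends only on rounding.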
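The scoring rule can be written as a small sketch. The cutoffs are those given above; the function name and the example values are hypothetical, and treating a value exactly equal to the cutoff as "not producing" is an assumption, since the text does not cover that case.

```python
# Cutoffs on relative content (reference sample = 1), as stated in the text.
THRESHOLDS = {"gamma-decalactone": 0.05, "mesifurane": 0.1}


def produces(compound, relative_content):
    """Score a line as producing (True) or not producing (False) a compound."""
    # Assumption: a value exactly at the cutoff is scored as not producing.
    return relative_content > THRESHOLDS[compound]


# Hypothetical relative contents for three lines.
lines = {"line_A": 0.02, "line_B": 0.08, "line_C": 1.3}
scores = {name: {c: produces(c, value) for c in THRESHOLDS}
          for name, value in lines.items()}
print(scores)
```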

Candidate gene analysis
DNA for molecular markers and candidate gene analyses was isolated from young leaves of the parents and F 1 mapping population using a modified CTAB method based on that of Doyle and Doyle (1990). temperature for each primer pair (Table II). PCR products, which ranged between 167 and 402 bp in length, were loaded and electrophoresed in agarose and non-denaturing gels, as described (Zorrilla-Fontanesi et al., 2011a;2011b). At least two different SSCP bands from each gene were picked from the gels, amplified and directly sequenced to verify their identity. The microsatellite EMFv010 (James et al., 2003) was also amplified in the octoploid mapping population in order to increase saturation in HG VI.

Linkage mapping
Polymorphic SSR and SSCP bands plus the two volatile compounds resolved as Mendelian loci were scored by two different observers. Then, a χ² goodness-of-fit analysis was performed to compare the observed segregation ratios with those expected for single and multiple dose markers under disomic or octosomic inheritance (Lerceteau-Kohler et al., 2003).
Markers were considered to have significantly skewed ratios at P ≤ 0.05.
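For a marker scored as band present or absent, the goodness-of-fit test reduces to a χ² with one degree of freedom. The sketch below covers only this two-class case (for example 1:1 or 3:1); the counts are invented, and the one-degree-of-freedom P value is obtained from the complementary error function rather than from a statistics package.

```python
import math


def chi2_two_classes(observed, expected_ratio):
    """Chi-square goodness of fit for two classes (1 df); returns (chi2, P)."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (present, absent) in a progeny of 95 lines.
chi2_a, p_a = chi2_two_classes((50, 45), (1, 1))   # close to 1:1
chi2_b, p_b = chi2_two_classes((70, 25), (1, 1))   # far from 1:1
chi2_c, p_c = chi2_two_classes((70, 25), (3, 1))   # close to 3:1

for label, p in (("50:45 vs 1:1", p_a), ("70:25 vs 1:1", p_b), ("70:25 vs 3:1", p_c)):
    print(label, "skewed" if p <= 0.05 else "fits")
```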
Linkage analyses and map construction were performed using JoinMap® 4 (van Ooijen, 2006), with the population coded as CP. Two independent parental maps and the integrated map were constructed using the novel molecular markers and those previously located in the '232' × '1392' map (Zorrilla-Fontanesi et al., 2011b). To generate the maps, a double pseudo-testcross strategy was employed, including 1:1, 3:1 and codominant markers segregating from each parental line (Grattapaglia and Sederoff, 1994). Grouping was performed using independence LOD and the default settings in JoinMap. Groups were generally chosen from a LOD of 5.0-8.0, although for some groups this value was decreased

QTL analysis
Because of the high number of traits and the different origin of the plants (seed-derived in the first year, vegetatively propagated in the second, third and fourth years), we excluded the data obtained in the first year (2006). QTL analyses were performed using MapQTL® 5 (van Ooijen, 2004) on data from the last three years (2007-2009). Because most metabolites deviated from normality, the raw relative data for a total of 83 volatile compounds (volatiles 44, 53, 83 and 86 were excluded) were first analyzed by the non-parametric Kruskal-Wallis (K-W) rank-sum test. A stringent significance level of P = 0.005 was used as threshold, as suggested by van Ooijen et al. (2004). Second, different data transformations were tested to identify the one that best achieved normality (see Statistical Analysis section), since normally distributed data are preferred for subsequent QTL analysis based on interval mapping (IM). Lastly, the genetic linkage map of each parental line and the transformed data sets for most traits were used to identify and locate QTLs by IM (Lander and Botstein, 1989). Identified QTLs were described by the marker with the highest significance level in the corresponding QTL region. For IM, the all-markers mapping approach was used to upgrade marker information (Knott and Haley, 1992; Maliepaard and van Ooijen, 1994).
This method employs not only the flanking markers but also markers from neighboring intervals to calculate the probabilities of a QTL. Five neighboring intervals and a step size of 3 cM were used. Significance LOD thresholds were estimated with a 1,000-permutation test (Churchill and Doerge, 1994) for each volatile and year on each map and QTLs with LOD scores greater than the genome-wide threshold at P ≤ 0.05 were declared significant. For each LOD peak, the 1-LOD support interval was determined (van Ooijen, 1992). The percentage of variance explained by each QTL and the genotypic information coefficient (GIC, which ranges from 0 to 1, with 0 indicating no marker information and 1 complete or maximum marker information) were also calculated. QTLs were named in italics using the volatile code followed by the name of the LG in which the QTL was located. QTL positions and 1-LOD confidence intervals were drawn using MapChart 2.2 for Windows.
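The logic of the permutation threshold can be sketched as follows: trait values are shuffled among individuals, the highest LOD across the genome is recorded for each shuffle, and the 95th percentile of these maxima is taken as the genome-wide threshold. The sketch is not the MapQTL implementation. It substitutes a single-marker LOD for interval mapping, and the data set (95 individuals, 10 unlinked markers, one of them affecting the trait) is simulated.

```python
import math
import random


def single_marker_lod(genotypes, phenotypes):
    """LOD of a two-class marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for g in (0, 1):
        ys = [y for gt, y in zip(genotypes, phenotypes) if gt == g]
        if ys:
            class_mean = sum(ys) / len(ys)
            rss_marker += sum((y - class_mean) ** 2 for y in ys)
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Genome-wide LOD threshold: (1 - alpha) quantile of the per-permutation maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-trait association
        maxima.append(max(single_marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]


# Simulated data (not from the study): marker 0 shifts the trait by 2 units.
rng = random.Random(42)
n_individuals, n_markers = 95, 10
markers = [[rng.randint(0, 1) for _ in range(n_individuals)]
           for _ in range(n_markers)]
trait = [2.0 * g + rng.gauss(0, 1) for g in markers[0]]

threshold = permutation_threshold(markers, trait)
observed_lod = single_marker_lod(markers[0], trait)
print(f"threshold LOD = {threshold:.2f}, observed LOD = {observed_lod:.2f}")
```

Taking the maximum over all markers in each permutation is what makes the threshold genome-wide rather than per marker.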

Gene expression analysis
Total RNA was extracted from different tissues of F. × ananassa and F. vesca 'Reine des vallées' (acc. IFAPA660) as previously described (Manning, 1991). Prior to reverse transcription, RNA was treated with DNase I (Fermentas) to remove any residual contaminating genomic DNA. First-strand cDNA was synthesized using the i-script kit (Bio-Rad) following the manufacturer's instructions. Two microliters of a dilution of the reaction product was subjected to semi-quantitative RT-PCR in a 20 µL reaction volume for 25 PCR cycles. Primers used to amplify the Farib413 (18S-26S inter-spacer ribosomal gene) constitutive control were described elsewhere. Sequences of the primers used to analyze FaOMT expression are shown in Table II; they amplify a fragment of the 3' UTR of the cDNA. This primer pair amplified a product of the same size with approximately the same efficiency using F. vesca or F. × ananassa genomic DNA as template (data not shown). The amplification products were separated on a 1.5% agarose gel stained with ethidium bromide and visualized with UV light. For analysis of FaOMT expression in contrasting lines of the mapping population, RNA was extracted from the same pool of fruit tissue used for volatile profiling in season 2009. For the rest of the expression studies, two independent biological replicates were assessed; the same results were obtained and only one is shown.

Promoter isolation and analysis
For characterization of the FaOMT promoter region from F. × ananassa lines, we sequenced three independent clones obtained from (1)  Sequence analyses and comparisons were carried out using the BioEdit software.

Supplemental Data
The following materials are available in the online version of this article.
Table S1. Pair-wise correlations among the 87 volatiles identified in the '232' × '1392' mapping population for year 2008.
Table S2. QTLs controlling the content of esters.
Table S3. QTLs controlling the content of alcohols and terpene alcohols.
Table S4. QTLs controlling the content of aldehydes, ketones and furans.
Figure legend (partial): PC1 and PC3 (right). Volatile compounds that accounted most for the variability of aroma profiles across PC1, PC2 and PC3 are highlighted in green, blue and yellow circles, respectively. For volatile codes see Table 1. PC1, PC2 and PC3: first, second and third principal components, respectively.
Figure legend (partial): (2006-2009). Individuals with a relative content for a given compound similar, lower or higher than that of the reference sample are shown in black, green or red, respectively. Clusters of volatiles are indicated by different letters.