Expression dynamics of the Medicago truncatula transcriptome during the symbiotic interaction with Sinorhizobium meliloti: which role for nitric oxide?

Medicago truncatula is one of the most studied model plants. Nevertheless, the genome of this legume remains incompletely determined. We used RNA-Seq to characterize the transcriptome during the early organogenesis of the nodule and during its functioning. We detected 37,333 expressed transcription units; to our knowledge, 1,670 had never been described before and were functionally annotated. We identified 7,595 new transcribed regions, mostly corresponding to 5' and 3' untranslated region extensions and new exons associated with 5,264 previously annotated genes. We also inferred 23,165 putative transcript isoforms from 6,587 genes and measured the abundance of transcripts for each isoform, which suggests an important role for alternative splicing in the generation of proteome diversity in M. truncatula. Finally, we carried out a differential expression analysis, which provided a comprehensive view of transcriptional reprogramming during nodulation. In particular, depletion of nitric oxide in roots inoculated with Sinorhizobium meliloti greatly increased our understanding of the role of this reactive species in the optimal establishment of the symbiotic interaction, revealing differential patterns of expression for 2,030 genes and pointing to the inhibition of the expression of defense genes.

Legumes (Fabaceae) are of considerable socioeconomic importance, accounting for one-third of the world's primary crop production (Graham and Vance, 2003). These plants are characterized principally by their ability to reduce atmospheric nitrogen through the establishment of a symbiotic association with bacteria called rhizobia. This ability to fix atmospheric nitrogen allows these plants to grow on soils that are deficient in nitrogen, decreasing both the need for costly nitrogen fertilizers and the water pollution they may cause (Graham and Vance, 2003).
A new organ, the root nodule, is formed to house this symbiosis. Within this organ, there is an exchange of nutrients: the bacteria provide the plant with ammonia, and the plant provides the bacteria with carbohydrates. Root nodule formation in the symbiotic relationship involving the model plant Medicago truncatula and its bacterial symbiont Sinorhizobium meliloti involves several steps. An exchange of signals induces cell division in the inner root cortex and the formation of a nodule primordium (Long, 2001). In parallel, rhizobia infect the root via infection threads initially formed in root hairs (Jones et al., 2007). The rhizobia are then taken up by endocytosis, forming organelle-like structures called symbiosomes and differentiating into nitrogenfixing bacteroids. On histological examination, the final nodule can be seen to consist of several different zones: a meristematic zone I, an infection zone II, a fixation zone III, and, in older nodules, a senescence zone IV (Oldroyd and Downie, 2008). Many factors have been shown to control the nodulation process (Oldroyd and Downie, 2008), but nitric oxide (NO), which was initially reported to be involved in plant-pathogen interactions (Delledonne et al., 1998;Durner et al., 1998), seems to play an important role (Shimoda et al., 2005;del Giudice et al., 2011). NO depletion has been shown to cause the downregulation of plant genes involved in nodule development and to delay nodule formation (del Giudice et al., 2011).
Over the past 10 years, hundreds of plant and bacterial genes displaying differential expression during the nodulation process have been identified by transcriptome analyses in M. truncatula and S. meliloti El Yahyaoui et al., 2004;Capela et al., 2006;Starker et al., 2006;Benedito et al., 2008;Jones et al., 2008;Maunoury et al., 2010;Moreau et al., 2011). Studies of bacterial and plant mutants have led to the identification of plant genes involved in early stages of the symbiotic interaction (Mitra and Long, 2004). In one study, complementary DNA (cDNA) microarrays carrying 2,366 single genes were used to analyze M. truncatula nodule formation (Maunoury et al., 2010). This study demonstrated that a two-stage reprogramming of gene expression takes place during nodule formation. More recently, more than 3,400 differentially regulated genes and associated regulators have been identified in studies with a 16.4 K 70-mer oligonucleotide microarray (Moreau et al., 2011); this analysis led to the definition of four different stages of transcription reprogramming during the course of nodulation. Additional resources relevant to M. truncatula functional genomics include the M. truncatula Gene Expression Atlas (http:// mtgea.noble.org/v3/), which provides developmental expression data for most of the genes of M. truncatula (Benedito et al., 2008). However, the picture remains incomplete despite the availability of all these data, because all the studies to date have been carried out with microarrays based on an incomplete genome. The genome sequence of M. truncatula is currently being annotated by the International Medicago Genome Annotation Group, which recently reported a draft sequence of the M. truncatula euchromatin based on a bacterial artificial chromosome assembly supplemented with an Illumina shotgun sequence. On the basis of alignments with ESTs, the authors estimated that about 94% of all M. truncatula genes have now been captured (Young et al., 2011). In total, 64,388 gene loci were described in version 3.5 of the genome sequence, in which 14,322 of the predicted genes were annotated as transposons (Young et al., 2011). However, the annotation of the M. truncatula genome remains incomplete.
The development of high-throughput DNA sequencing technologies has provided new methods for the characterization and quantification of transcriptomes (RNA-Seq) at the single-nucleotide level. Here, we used the RNA-Seq approach for transcriptome characterization during early nodule organogenesis and nodule functioning. Our data significantly improve the annotation of the M. truncatula genome by identifying 7,595 new transcribed regions in the radius of known genes, 139 new genes from previously annotated intergenic regions, and 1,670 new transcription units (TUs) not mapped on the reference genome. We also paid particular attention to alternative splicing (AS) and to the role of NO in the optimal establishment of symbiosis.

High-Throughput Sequencing and Mapping of the M. truncatula Transcriptome
We profiled the nodulation transcriptome by isolating mRNA from noninoculated nitrogen-starved roots (referred to hereafter as "noninoculated roots"), roots 4 d after inoculation ("inoculated roots"), and 21-d-old nodules ("nodules"). We also investigated the role of NO in modulating transcriptional reprogramming during the establishment of symbiosis by isolating mRNA from inoculated roots treated with the NO scavenger 2-(4-carboxyphenyl)-4,4,5,5-tetramethylimidazoline-1-oxyl-3-oxide (cPTIO). We harvested only the 2-cm region of the roots in which most of the nodule primordia appeared (see "Materials and Methods"; Supplemental Fig. S1). For each set of conditions, we used two biological replicates to generate independent libraries for RNA-Seq that were sequenced separately with an Illumina genome analyzer II.
We generated approximately 135.5 million high-quality 36-bp reads (Supplemental Table S1), which were then aligned with the M. truncatula genome sequence (Mt3.0 version) and with sequences from a custom splicejunction database, for the detection of transcribed regions and splicing sites. This provided information about the accuracy of gene predictions, the extent and pervasiveness of genome transcription, and the frequency of AS events. We identified a unique map position on the reference genome for 54.6% of the 135.5 million reads (Table I). About 8% of these reads mapped to known alternatively spliced junctions. We also analyzed the distribution with respect to genes of the 74 million uniquely mapped reads. We found that most (78.8%) mapped to protein-coding genes ( Table I). The others mapped to introns (1.25%), untranslated regions (UTRs; 12%), and intergenic regions (6.2%). The proportion of reads mapping to UTRs and new exons was lower for inoculated roots and nodules than for noninoculated roots, whereas no such difference was observed in cPTIOtreated inoculated roots.

New TUs and AS
For improvement of the current annotation of the M. truncatula genome, we first clustered the reads mapping to nonannotated regions into 7,875 putative new transcribed regions. Most of these regions (7,595) were within 3,000 nucleotides of 5,264 known genes ( Fig. 1; Supplemental Table S2). The remaining newly identified transcribed regions were used to define 139 new TUs (Supplemental Table S3). Functional annotations were obtained for 61 of these TUs. Differences in the number of reads corresponding to these new transcribed regions were observed between conditions (Fig. 1, B and C; Supplemental Table S2), with almost 30% fewer reads mapping to new 59 and 39 UTR extensions in inoculated roots and nodules than in noninoculated roots. No such difference was observed in cPTIO-treated inoculated roots (Fig. 1B).
A large proportion of the reads (38%) could not be mapped to the Mt3.0 version of the M. truncatula reference genome, probably due to incomplete genome coverage in this version of the reference sequence. We assembled these unmapped reads into 16,335 TUs of 101 to 5,984 nucleotides in size. Most (12,071) matched with M. truncatula Gene Index sequences (tentative consensus, full-length cDNA sequences, and singleton ESTs from The Institute for Genomic Research database) and clustered into 7,122 TUs. We obtained a final set of 11,434 TUs that were also mapped against the new RNA-Seq data accompanying the Mt3.5 version of the M. truncatula genome (Young et al., 2011). Finally, 1,670 TUs could not be mapped to any known M. truncatula Gene Index sequence, Medicago Affymetrix probe set, or RNA-Seq data (Mt3.5). These 1,670 TUs without any correspondence in public databases were considered to be new TUs. Functional annotations were obtained by BLAST2go interrogation (http://www.blast2go.com) of National Center for Biotechnology Information nonredundant databases for only 881 of these new TUs (E value # 1 3 10 28 ).
We analyzed splice variants with Cufflinks (Trapnell et al., 2010), which is not restricted by prior gene annotation and therefore can be used to identify completely new isoform transcripts and to determine their abundance. Despite limitations due to the lack of paired reads, we identified 6,587 genes with a minimum of two isoforms, giving a total of 23,165 different transcripts (Supplemental Table S4). Of these, 8,557 were known isoforms referenced in the available M. truncatula transcriptome annotation (Mt3.5), whereas 14,608 were new. In addition, 2,457 of these newly identified isoforms presented a reference intron fully covered by the reads, indicating intron retention events. The relative abundance of each of the isoforms, reported in expected fragments per kilobase of transcript per million fragments sequenced, revealed that the multiple gene isoforms were often coexpressed in the same organ but regulated differently during nodulation (Supplemental Tables S5 and S6). The frequency of statistically significant splicing events in cPTIO-treated inoculated roots was more than twice that in untreated inoculated roots (Supplemental Table S7).
Global Analysis of the M. truncatula Transcriptome during Symbiotic Interaction On a single-gene basis, more than 48% of the genes and almost 68% of the exons for which we had evidence of transcription in at least one condition were covered by reads over more than 90% of their length. We defined 37,333 TUs supported by at least two reads in at least one of the four conditions assayed (Table II). The application of a more stringent criterion, a minimum of five reads, decreased this number slightly, to 34,515 TUs, with 29,227 for noninoculated roots and 32,663 for inoculated roots treated with cPTIO. Moreover, in order to assess the agreement between biological replicates for each condition, we calculated the percentage of TUs detected in both replicates at different reads per kilobase per million mapped reads (RPKM) thresholds relative to the total number of detected TUs in either replicate. The results showed a high agreement between replicates, with overlap values ranging from 87.9% to 93.2% at an RPKM threshold of 1. Similar results were found also for higher thresholds (Supplemental Fig. S2).
We investigated whether the differences in gene coverage between RNA sources were due to differences in sample size in terms of mapped reads (ranging from 16.4 million [7.24 + 9.16] for root control to 45.4 million [22.84 + 22.57] for inoculated roots) or to real differences in transcriptome complexity by subsampling and inferring the transcript discovery rate as a function of the number of uniquely mapped reads (Supplemental Fig. S3). The graphs clearly show that, at the depth of sequencing achieved, we have almost reached a plateau for each sample. Furthermore, Spearman's rank correlation analysis on crude counts showed a strong correlation between the two replicates of each condition: 0.90, 0.83, 0.99, and 0.86 for control, inoculated roots, cPTIO-treated inoculated roots, and nodules, respectively.
Expression levels of the 50,962 TUs predicted from version Mt3.0 of the M. truncatula genome and of the 11,573 TUs identified (new TUs aligned with the reference genome and all nonaligned TUs) were determined with ERANGE (Mortazavi et al., 2008) and are expressed in RPKM. The distribution of genes on the  basis of expression level was similar in the four samples ( Fig. 2). In total, 28,744 TUs were identified in at least one condition, with a cutoff value of 2.77 RPKM corresponding to a confidence level of 95% (Table II;  Supplemental Table S8). Most of the TUs (76.5%; 21,967) were common to all four conditions, consistent with previous microarray analyses (Benedito et al., 2008). The expression patterns of a set of genes identified in our RNA-Seq analysis were validated by quantitative real-time (qRT)-PCR (consistent results for 74% of 50 tested genes; Supplemental Table S9). It must be underlined here that the qRT-PCR experiments were performed with the two biological replicates used for the RNA-Seq analysis together with three other biological replicates.
Dynamic Reprogramming of the M. truncatula Transcriptome during Nodulation The 28,744 TUs were grouped by K-means analysis (Sturn et al., 2002) into six clusters differing in terms of their expression dynamics during nodulation (Fig. 3). These clusters differed considerably in size: clusters 4 and 6 each contained only 1% of all the TUs, whereas cluster 1, which contained all the genes stably transcribed in the various conditions, accounted for 80% of the TUs. Genes down-regulated or not expressed in nodules were grouped into clusters 2 (3,461 TUs) and 4 (362 TUs), whereas most of the genes induced in nodules were grouped into clusters 5 (1,023 TUs) and 3 (735 TUs). The last cluster (cluster 6) was composed of 279 TUs induced in both inoculated roots and mature nodules (Supplemental Table S8).
We further explored transcriptional reprogramming during early nodule formation and functioning by calculating Student's t statistics, with a significance threshold of P , 0.001 and a false discovery rate (FDR) of 0.001. In this analysis, in which we selected only transcripts for which expression levels differed by a factor of at least 2 between two conditions (Table III), we identified 2,018 TUs displaying differential expression in inoculated and noninoculated roots and 10,805 TUs displaying differential expression in nodules and noninoculated roots (Supplemental Tables S10 and S11). In the early stage of the interaction (inoculated roots), most (1,541) of the 2,018 differentially expressed TUs were up-regulated ( Fig. 4; Supplemental Table S10), indicating the occurrence of substantial changes during the development of a new organ from the root tissue. Only one-quarter (389) of these TUs continued to be up-regulated in nodules (Fig. 4). By contrast, most (380) of the 477 TUs downregulated in inoculated roots continued to be downregulated in nodules. By contrast to what was observed for inoculated roots, most (8,418) of the 10,805 TUs differentially expressed in nodules were down-regulated ( Fig. 4; Supplemental Table S11).
We compared our RNA-Seq data with data generated in a previous genome-wide analysis of M. truncatula gene expression based upon Affymetrix chip technology (Benedito et al., 2008). We identified a set of 4,697 genes found to be differentially regulated in nodules by both techniques; 1,292 of the 2,387 up-regulated genes were described as up-regulated in 14-and 28-d-old nodules in the M. truncatula Gene Expression Atlas, while 3,405 TUs were found to be down-regulated by both techniques. It should be noted that a low median of expression (,2 RPKM) was observed for the 5,013 TUs not identified as down-regulated in the M. truncatula Gene Expression Atlas. Of the 2,387 up-regulated TUs, 1,998 were identified as induced only at this late stage of symbiosis. We also identified a number of new differentially expressed genes not represented on the current Affymetrix Medicago GeneChip: 37.4% and 30.8% of the 2,018 and 10,805 genes modulated in inoculated roots and nodules, respectively, including several new nodulespecific Cys-rich peptides (annotated as late nodulins: 82 of 252 genes up-regulated in nodules) and transcription factor (TF; 95 of 530 genes induced in nodules) genes. This highlights the greater power of RNA-Seq than of microarrays in terms of sensitivity and ability to discover new genes.
We then analyzed the data set for inoculated roots treated with cPTIO, a scavenger of NO. The efficiency of cPTIO treatment was confirmed by the observed regulation of three genes known to be regulated by NO: those encode a Glu-Cys ligase (Medtr5g010350), which is involved in reduced glutathione synthesis (Innocenti et al., 2007), a putative lipoxygenase (Con-tig3307), which is involved in the jasmonate pathway (Palmieri et al., 2008), and class I hemoglobin (Con-tig525), all of which were repressed by cPTIO. The NO-dependent up-regulation of a class I hemoglobin was also observed in Lotus japonicus during the early stage of nodulation (Shimoda et al., 2005). cPTIO treatment resulted in changes in the expression of 2,030 TUs with respect to the pattern observed in inoculated roots (P , 0.001, FDR = 0.001; Table IV). The most striking finding was the large number (1,242) of TUs down-regulated by this treatment. In total, we identified 415 TUs displaying differential regulation in inoculated versus noninoculated roots and in cPTIO-treated inoculated roots versus inoculated roots (Supplemental Table  S13). For 245 of these TUs, the up-regulation observed in inoculated roots was prevented by NO depletion (Supplemental Table S13). Among the remaining 170 TUs, of which the down-regulation upon inoculation was prevented by NO depletion, genes related to plant defenses were well represented (53 TUs; Supplemental Table S14), indicating that, in contrast to pathogenic interactions, NO repressed defense reactions.
To assess the specificity of the cPTIO effects, the expression levels of defense genes responding to the cPTIO treatment were tested via qRT-PCR in the presence of compounds that are known to be inhibitors of NO synthesis: tungstate, which inhibits nitrate reductase, and L-NG-nitro-arginine methyl ester (L-NAME), which inhibits NO synthase. The obtained results (Supplemental Fig. S4) well validated again the RNA-Seq results and showed that the cPTIO effect could be mimicked by tungstate for a majority of the genes tested, while L-NAME appeared to be ineffective. This indicates that the observed cPTIO effects are linked, at least partially, to NO levels. Furthermore, analyses on transgenic roots with promoter-reporter gene fusions (promoters of the Medtr1g124600 and Medtr7g071380 genes, encoding a glutathione transferase and a chalcone synthase, respectively) confirmed the effect of tungstate on the transcriptional regulation of genes identified as regulated by cPTIO (Supplemental Fig. S5). The MapMan representation tool (http://mapman. gabipd.org) provided an overview of the biotic stress pathway, leading to the identification of major NOdependent modifications (Supplemental Fig. S6), clearly suggesting that NO has a beneficial effect, repressing the defense reactions of the plant to favor establishment of the symbiotic interaction.

DISCUSSION
This work describes the transcriptome landscape of M. truncatula and its expression dynamics at two important stages of the symbiotic interaction with S. meliloti: early organogenesis of the nodules and fully differentiated and functioning nodules.
We used RNA-Seq techniques to generate about 5 Gb of sequence information, corresponding to about 10 times the size of the M. truncatula genome (estimated size of 471-583 Mb [Medicago Genome Sequence Consortium, 2007]). After alignment with the reference genome, the data set of the 74 million uniquely mapped reads provided a rich resource for improving gene annotation across the M. truncatula genome. We identified 7,595 new transcribed regions associated with 5,264 annotated genes. These new transcribed regions contained numerous new exons within introns, new 59 and 39 UTRs, and extensions of known 59 and 39 UTRs, making it possible to better define gene boundaries. Extended 59 UTRs may reflect alternative transcription start sites, as demonstrated in Schizosaccharomyces pombe, in which more than 20 genes display dynamic changes in 59 UTR length during the course of sexual differentiation (Wilhelm et al., 2008). The extension of 39 UTRs may have an impact on gene expression regulation (e.g. if providing microRNA binding sites). We also identified 139 new TUs in intergenic regions that were not described in version Mt3.0 of the M. truncatula genome, more than one-half of which may correspond to noncoding RNAs, as no functional annotation was found in public protein databases (National Center for Biotechnology Information nonredundant database). Finally, we assembled a large number of reads that did not map onto the reference genome into 11,435 TUs, 1,670 of which had not been identified before. These data are currently used, in addition to others, to gain a more comprehensive view of the expressed genome in various organs and physiological conditions and thereby to improve the M. truncatula genome annotation.
Our analysis of gene expression during the symbiotic interaction identified 28,744 genes expressed, in at least one condition, above the selected cutoff value of 2.77 RPKM and confirmed the power of RNA-Seq for the in-depth quantitative analyses of transcriptomes. A K-means clustering analysis showed that 80% of the expressed genes were transcribed to similar levels in the different conditions and revealed major transcriptional reprogramming in fully formed and functional nodules (clusters 2-5). Therefore, we used a stringent Student's t test analysis (P , 0.001, FDR = 0.001) to investigate, in depth, the two steps in the symbiotic interaction studied here (Table IV). A comparison of inoculated and noninoculated roots showed that 2,018 genes were differentially regulated in these two conditions (Supplemental Table S10), most (1,541) being up-regulated during symbiosis. Most of these genes (1,026) did not appear to be differentially regulated in mature nodules (Fig. 4). Our analysis of the genes upregulated in inoculated roots highlights the importance of (1) genes encoding extensins (root nodule extensins), which are abundant in the infection thread lumen (Brewin, 2004); (2) cyclin-like and ribosomal protein-encoding genes, which are involved in the cell cycle and development; and (3) genes involved in redox regulation, consistent with the role of reactive oxygen species and antioxidant defenses in key stages of nodule formation, such as infection thread development and nodule meristem formation (Chang et al., 2009). This analysis also revealed changes in the level of expression of genes not previously identified as involved in nodulation that (1) are involved in protein processing and degradation, highlighting the reorganization of the protein pool occurring during nodulation; (2) encode cytochrome P450s, which may be involved in flavonoid synthesis (Zhang et al., 2007); and (3) encode putative proteins similar to arginase Table III. Differential expression analysis Shown are the number of TUs for which expression levels differed by a factor of at least 2 between two conditions with a significantly different level based on Student's t test and a FDR obtained with the R package qvalue (Dabney et al., 2010) threshold of q , 0.001. In parentheses, the number of up-regulated TUs is given. TUs that have been identified as differentially expressed in an Affymetrix study (Benedito et al., 2008). (Supplemental Fig. S7A), an enzyme that has been reported to hydrolyze Arg to generate urea and Orn in higher plants, Orn being a precursor of polyamines (Chen et al., 2004). The 389 genes that continued to be up-regulated in nodules included many encoding nodulins and early nodulins, such as MtNIN, ENOD40, ENOD20, HAP2.1, MtN1, MtN2, and MtN6 (Crespi et al., 1994;Gamas et al., 1996;Combier et al., 2006;Marsh et al., 2007), highlighting the role of these compounds in nodule development and function. A similar expression profile was found for genes encoding formins, which play an important role in cytokinesis and cell expansion (Blanchoin and Staiger, 2010), and genes encoding proteins involved in steroid metabolism. These findings suggest that cardenolides (Bauer et al., 2010), in addition to their possible role in the synthesis of brassinosteroids, which have been shown to be involved in nodulation (Ferguson et al., 2005), are synthesized from the earliest stages of the interaction.
A large number of genes (10,805) displayed differential expression in nodules and noninoculated roots (Supplemental Table S11), highlighting the major reprogramming triggered by the establishment of a functional symbiosis. However, most (78%) of these genes were down-regulated in nodules. One of the most striking results of the global analysis of the gene families represented (Fig. 5) is the repression of genes encoding chalcone and terpenoid synthases (57 downregulated TUs) and the repression of 95% of the disease resistance genes (310 of 328). Furthermore, consistent with the extensive down-regulation of genes observed at this stage, approximately 70% of the 598 genes encoding TFs and associated regulators of gene expression, the largest group of genes identified, were found to be down-regulated or even switched off in nodules (Table IV). More than 80% of the genes related to auxin, redox state, and sugar transporters (Fig. 5) were also down-regulated. This suggests that hormone and redox balance may be modified in the functioning nodule and indicates that dicarboxylic acids replace sugars as the carbon metabolites transported in this organ (White et al., 2007;Benedito et al., 2010). The large number of genes down-regulated is of particular importance, highlighting the high degree of functional specialization of the nodule.
By contrast, the 22% of induced genes belonged to a large number of gene families. The genes displaying the greatest differences in transcript abundance included those encoding leghemoglobins MtLb1 and MtLb2, nodulins, late nodulins and nodule-specific Cysrich genes, and many genes involved in carbon and nitrogen metabolism, such as Suc, Gln, and Asn synthetases. Many genes encoding sulfate transporters, multiantimicrobial extrusion proteins, and TGF-b receptors type I/II (including nitrate transporters, oligonucleotide transporters, etc.), probably important for exchanges between the two partners, were upregulated in nodules, confirming the importance of transport functions in nodule development and function. The observed up-regulation of nitrate transporters is of particular interest and may reflect the existence of a "nitrate respiration" process (Horchani et al., 2011). Finally, a large number of TF-encoding genes not identified in previous analyses were identified here: these genes included those encoding two CCAAT-binding factors (two HAP3-like), one NIN-like, 17 WRKY, and no less than 92 zinc finger proteins (Supplemental Table  S11). Overall, about 35% of the genes identified by RNA-Seq as displaying differential regulation between nodules or inoculated roots and noninoculated roots are not present on the Affymetrix Medicago GeneChip. Consequently, additional genetic determinants specific to nodule development and functioning, that are reported here to our knowledge for the first time, can now be deciphered.
NO is required for optimal establishment of the M. truncatula-S. meliloti symbiotic interaction (del Giudice et al., 2011). Therefore, we characterized the effects of NO depletion on the nodulation transcriptome during the early phases of the symbiosis by carrying out RNA-Seq analysis on inoculated roots treated with the NO scavenger cPTIO. Surprisingly, NO depletion decreased the number of reads pertaining to 59 and 39 UTRs and new exons from that observed in inoculated roots and nodules to that for noninoculated roots, indicating a probable role for NO in decreasing transcriptome complexity during nodulation that merits further investigation  Table S2). A comparison of NOdepleted inoculated roots and inoculated roots revealed differential patterns of expression for 2,030 genes (Table  III; Supplemental Table S12). The most striking finding was the large number of genes from several gene families for which NO depletion prevented the up-regulation induced by inoculation with the microsymbiont (Fig. 6).
These genes included many encoding TFs and several genes encoding proteins involved in nodule development, such as cyclin-like proteins, peptidase, or ribosomal protein families, consistent with the reported effect of NO on the nodulation process (del Giudice et al., 2011). Another interesting finding is the failure of NO-depleted M. truncatula roots to trigger the expression of 11 of the 14 members of the plant lipid transfer protein gene family induced early in nodulation, as the proteins encoded by these genes may participate in the control of bacterial infection and the autoinhibitory regulation of nodule numbers by the host plant (Pii et al., 2009). NO depletion also triggered major changes in hormone signaling, as suggested by the lack of induction of several genes encoding auxin-responsive proteins (Supplemental Fig. S6), and resulted in a lack of activation of genes encoding proteins involved in the cell cycle and protein synthesis, both of which are required for the dedifferentiation of cortical cells and the induction of cell division during nodule formation. Thus, a large number of transcriptional regulators appear to be down-regulated by cPTIO treatment (Supplemental Table S12). One HAP3-like (Medtr1g099630.1) gene appeared to be regulated by NO, possibly accounting in part for the impairment of nodulation by cPTIO (del Giudice et al., 2011). Other TF families known to be involved in nodulation (such as the NIN-like proteins) were not affected, highlighting the specificity of the effect of NO on TF regulation. The genes for which down-regulation in inoculated roots was prevented by NO depletion (Supplemental Table S13) included the gene encoding the NADPH oxidase RBOHC and a large number of genes encoding Tyr kinases, PR proteins, glutathione S-transferases and chalcone synthases, and genes encoding proteins involved in terpene, flavonoid, and phenylpropanoid pathways ( Fig. 6; Supplemental Fig.  S6C). These findings clearly suggest that NO plays a crucial role in repressing the defense reaction in symbiotic conditions, thereby favoring establishment of the beneficial plant-microbe interaction. This action differs markedly from the signaling functions of NO in pathogenic interactions, in which NO cooperates with reactive oxygen species to induce hypersensitive cell death and functions independently of such intermediates in the induction of defense-related genes (Delledonne et al., 1998). This hypothesis was confirmed, at least partly, by the experiments performed with tungstate; it must be noted here that L-NAME, which inhibits NO synthase-like activity, did not affect the expression level of the defense genes. Taken together, these results shed new light on the possible role of nitrate reductase in the synthesis of NO during the first steps of the nodulation process. Finally, we assessed the complexity in the nodulation transcriptome by determining the frequency of the various alternatively spliced forms. We identified 23,164 different transcripts derived from 6,587 genes. Although originally thought to be less frequent in plants than in animals, AS is now known to be widespread in plants (Reddy et al., 2012). A recent analysis using RNA-Seq has revealed that over 40% of introncontaining genes in Arabidopsis (Arabidopsis thaliana; Filichkin et al., 2010) and about 48% in rice (Oryza sativa; Lu et al., 2010) undergo AS. Concerning legumes, the only data available are based on EST alignments: the observed frequency of alternatively spliced genes was described to be lower in M. truncatula (approximately 10%) and L. japonicus (approximately 3%) than in Arabidopsis and rice; however, AS frequencies appeared comparable in all four species when EST levels were normalized (Wang et al., 2008). Although it is not known how much of AS is due to noise in the splicing process (Melamud and Moult, 2009), the estimated extent of AS frequency in M. truncatula obtained in this study (6,587 genes) appears significantly higher than the one (1,107 genes) reported previously, based on EST alignments (Wang et al., 2008). This underscores the superior detection capability of the RNA-Seq method over the conventional EST approach, especially for low-abundance transcript variants (Filichkin et al., 2010). Therefore, we concluded that AS may play an important role in increasing the diversity of the M. truncatula proteome. Comparisons of inoculated roots or 21-d-old nodules with noninoculated roots led to the identification of 63 and 65 genes, respectively, 29 of which were common to the two comparisons. Expression analyses showed that several AS events were dynamically modulated, indicating the association of AS events with a specific phase of the nodulation process, confirming recent reports that many splice variants are differentially expressed in a development-specific manner or in Figure 7. Transcript abundances of multiple isoform genes regulated during the nodulation process and/or the cPTIO treatment. A, Gene models from the M. truncatula reference genome (version Mt3.5) showing the structure of the isoforms with the positions of the exons, introns, and UTR sequences. B, Relative abundance of transcript isoforms measured as a function of the condition in RPKM. In the case of the 1-aminocyclopropane-1-carboxylate (ACC) oxidase, we observed one unknown isoform (Medtr5g085330.1 i) in which part of an intron is conserved. Inoc., Inoculated roots; Inoc + cPTIO, inoculated roots treated for 8 h with cPTIO. response to developmental cues (Reddy et al., 2012). The alternatively spliced forms included four isoforms from a gene encoding a potential 1-aminocyclopropane-1-carboxylate oxidase (Medtr5g085330) involved in ethylene biosynthesis. Three of these isoforms were already known, and the fourth corresponded to an intron retention event. The major isoforms expressed during the nodulation were isoforms 2 and 3 rather than isoform 1 (Fig. 7). For the ATP-dependent Clp protease (Medtr4g069800), isoform 1, which was expressed in noninoculated roots, was totally repressed during symbiotic interaction, whereas the other two isoforms were induced (Fig. 7). Many switches between major and minor transcript isoforms were observed in our analysis, suggesting a probable role of AS in the adaptive changes observed during nodulation. Moreover, cPTIO treatment enhanced splicing (148 genes). For example, changes to the major isoform of dynaminrelated protein1C were blocked in NO-depleted inoculated roots (Fig. 7). cPTIO treatment seemed to block the repression of isoform 1 selectively, without affecting the other isoforms, in the case of a gene encoding a thioredoxin-like protein (Medtr7g093490; Fig. 7). These two examples suggest that NO may exert fine control over transcriptome complexity and expression dynamics.

CONCLUSION
In conclusion, although many genes differentially expressed during nodulation have been identified in previous studies, this investigation of the transcriptome of M. truncatula during establishment of the symbiotic interaction has considerably refined and expanded our knowledge of the basic building blocks required for the colonization and correct functioning of the nodule. The resulting view of the transcriptome, at single-base resolution, markedly improves our understanding of expression dynamics during the symbiotic process and has major biological implications that can now be evaluated experimentally.

Growth Conditions and Isolation of Total RNA
Sinorhizobium meliloti strain RCR2011 (Rosenberg et al., 1981) was grown in Luria-Bertani medium supplemented with 2.5 mM CaCl 2 and 2.5 mM MgSO 4 . Streptomycin at 100 to 300 mg mL 21 was added when required. Seeds of Medicago truncatula 'Jemalong A17' were scarified with H 2 SO 4 , surface sterilized in a bleach solution, rinsed with sterile distilled water, germinated on agar plates for 3 d in the dark, and allowed to grow on nitrogen-free Fahraeus medium plates (covered with pouch paper) during 8 or 21 d in a culture room (22°C-25°C) with a light/dark photoperiod of 16 h/8 h (http://www.noble. org/MedicagoHandbook/). Four days after the germinated seeds were transferred onto medium plates, the plants were inoculated with 200 mL per root of S. meliloti grown in liquid Luria-Bertani medium supplemented with 2.5 mM CaCl 2 and 2.5 mM MgSO 4 , pelleted at 4,500 rpm, washed twice with sterile distilled water, and resuspended in sterile distilled water to a final optical density at 600 nm of 0.01 (corresponding to an inoculum concentration of approximately 10 5 to 10 6 bacteria cells per mL). The control condition was inoculation with 200 mL of sterile distilled water. Four days post inoculation, some of the plants were treated with 250 mL of a 1 mM solution of cPTIO (Sigma) prepared in 10 mM Tris-KCl buffer that was added along the whole length of the roots. Plants were allowed to stay in contact with cPTIO for 8 h. For the inoculated root condition, roots were treated with 250 mL of 10 mM Tris-KCl buffer for 8 h. Treatments with L-NAME (Sigma) and tungstate (Fluka) were done following the same protocol using 5 and 2 mM, respectively, of each inhibitor of NO synthesis. Furthermore, in order to obtain a higher concentration of RNA from nodule primordia, a 2-cm region of the root in which most of the nodule primordia appear was harvested and immediately frozen in liquid nitrogen. To localize the root infection zone, we marked the position of the primary root apex on the day of the inoculation. Four days later, the infection zone was mainly located around this mark. For the nodule condition, plants grew on a medium plate for 21 d. Nodules were then harvested and immediately frozen in liquid nitrogen. A total of 100 mg of plant material was ground in liquid nitrogen, and total RNA was isolated using QIAshredder and RNeasy Mini kit columns (Qiagen; www.qiagen.com). The optional DNase treatment step was added to the protocol using the RNasefree DNase set (Qiagen). The integrity of total RNA was checked on a 2100 Bioanalyzer (Agilent; www.agilent.com), and its quantity as well as purity were determined with a Nanodrop 2000 apparatus (Thermo Scientific; www. thermofisher.com).

qRT-PCR Validation
For the qRT-PCR validation, cDNAs were made from the RNA isolated either by RNeasy Mini kit columns (as described previously) or Trizol reagent (Invitrogen; www.invitrogen.com). A total of 100 to 200 mg of plant material (root) was ground in liquid nitrogen, and total RNA was isolated using Trizol reagent. The integrity of total RNA was checked on agarose gels, and its quantity as well as purity were determined spectrophotometrically. A total of 500 to 1,500 ng of RNA was used as a template for reverse transcription reaction in a 20-mL reaction volume using the Omniscript RT Kit (Qiagen; www. qiagen.com). Real-time qRT-PCR was carried out using the qPCR Mastermix Plus for SYBR Green I reagent (Eurogentec; www.eurogentec.com). Reactions were run on the Chromo4 Real-Time PCR Detection System (Bio-Rad; www. bio-rad.com), and quantification was performed with the Opticon Monitor analysis software version 3.1 (Bio-Rad). Data were quantified using Opticon Monitor 2 (MJ Research) and analyzed with RqPCRBase, an R package working on the R computing environment for analysis of quantitative realtime PCR data (T. Tran and F. Hilliou, unpublished data). The mRNA levels were normalized against three constitutively expressed endogenous genes, Mtc27 (TC106535; Van de Velde et al., 2006) and two genes (Mtr.10324.1.S1_at and Mtr.31250.1.S1_at) selected as reference genes (del Giudice et al., 2011). PCR for each biological replicate was performed in three technical replicates. For each reaction, 5 mL of 100-fold-diluted cDNA and 0.3 mM primers were used. The initial denaturing time was 10 min, followed by 40 PCR cycles at 95°C for 10 s and 60°C for 1 min. The specificity of the amplification was confirmed by a single peak in a dissociation curve at the end of the PCR procedure. For each experiment, the stability of the reference genes across samples was tested using geNorm software (Vandesompele et al., 2002). The absence of genomic DNA contaminations in the RNA samples was tested by PCR analysis of all samples using oligonucleotides bordering an intron in the glutathione synthetase gene of M. truncatula.
The gene-specific primers used are listed in Supplemental Table S15.

Library Preparation and Sequencing
Poly(A) mRNA was isolated from the extracted RNA to prepare a nondirectional Illumina RNA-Seq library with the mRNA-SEquation 8 sample prep kit (Illumina). We modified the gel extraction step by dissolving excised gel slices at room temperature to avoid underrepresentation of AT-rich sequences (Quail et al., 2008). Library quality control and quantification were performed with a Bioanalyzer Chip DNA 1000 series II (Agilent). Each library had an insert size of 200 bp, and 36-to 44-bp sequences were generated on an Illumina genome analyzer II. The reads were distributed within the four conditions as follows: 16.4 million for the root control, 45.4 million for inoculated roots, 46.6 million for inoculated roots treated with cPTIO, and 27 million for nodules (Supplemental Table S1).

Mapping and Analysis of Illumina Reads
M. truncatula genome and splice database sequence alignments were generated with Bowtie (Langmead et al., 2009). Bowtie supports up to three mismatches, reports reads aligning to multiple locations in the genome, is able to assess the quality of the sequences generated, and its output can be directly parsed by the ERANGE package (Mortazavi et al., 2008) that was used for differential expression analysis. For our analysis, we allowed up to two mismatches (parameter "-v 2"). The sequences were discarded if they aligned on more than 10 different loci (parameter "-m 10"). The program reported all the matches it was able to find for the remaining reads (parameter "-k 11"). Alignment of the reads was made on the Mt3.0 version of the M. truncatula genome sequence (www.medicago.org), and the output was processed using the software suite BedTools (http://code.google.com/p/bedtools/; Quinlan and Hall, 2010) in order to assign each read to an exon, intron, UTR, or intergenic region. Reads mapped onto external exons fell within a 3-kb catchment from both ends of a gene, promoting the investigation of putative undiscovered exons. Intergenic reads represented those sequence reads that fell outside this catchment. The program ERANGE defined potentially novel clusters of expression on the basis of their alignment; they were categorized as novel sections (exons/UTRs) of a known gene if they fell inside a radius of 3,000 bp from them (average gene density/2). The remaining expressed clusters were marked as potential new genes.
In order to identify potential new isoforms of known genes, we remapped all reads against the M. truncatula genome using TopHat (Trapnell et al., 2010), with a segment length of 16 due to the short length of our reads, and defined the new isoforms of known genes by performing a Cufflinks analysis on each sample, with standard parameters, followed by an analysis with Cuffcompare to merge transcripts identified on different samples (Trapnell et al., 2010). We used the latest genome sequence and annotations provided by the M. truncatula research community (Mt3.5; http://www.medicago.org/). The Mt3.0 version consists of essentially the same sequence data found in the more recent Mt3.5 version, except that the assembly of Mt3.5 was based on newer optical map results (Branca et al., 2011;Young et al., 2011). Therefore, we used the Mt3.5 version only for the identification of AS events.

Clustering
The expression data of the 28,744 genes expressed (in RPKM) were transformed with the software Expression Profiler from the European Bioinformatics Institute (www.ebi.ac.uk) to get a relative expression ratio (in log 2 base) around the gene's average value. The clustering was made following the K-means method with Pearson's correlation distance using the Genesis bioinformatic tool (Sturn et al., 2002; http://genome.tugraz.at). Different numbers of clusters were tested to identify the minimum number of clusters to represent all the separated expression profiles.

Identification of Novel Transcribed Regions
We used the reads that had not been mapped against the Mt3.0 sequence from every sample to assemble separately our contigs, using the Velvet program (Zerbino and Birney, 2008), using a sensitive hash length of 29 for the reads with a length of 44 bp and of 21 for the rest. The contigs were subsequently clustered together using the software CAP3 (Huang and Madan, 1999), with a minimum overlap of 90%, requiring an overlap identity of 80%. Contigs mapping against the reference genome with identity of 90% or greater and coverage of 90% or greater after BLAT alignment were discarded from further analysis. All the contigs were also mapped against the accompanying RNA-Seq data of the Mt3.5 version with GMAP (version 2012-04-21). The contigs, with an alignment coverage on the sequence length of 90% or greater and on the identity of 90% or greater, were merged together using the program mergeBed from the BEDTools suite (http://code.google.com/p/bedtools/).

Gene Expression Analysis
The evaluation of gene expression was performed with the ERANGE software (version 3.1) available at http://woldlab.caltech.edu/RNA-Seq (Mortazavi et al., 2008). ERANGE requires Cistematic 2.5 to execute Run-Standardanalysis.sh. Therefore, a Python script (medicago3.py) was developed to import the M. truncatula reference sequence (Mt3.0) and gff annotation into Cistematics Genomes, and a Perl script (gff2knowngene.pl; Zenoni et al., 2010) was used to convert the gff annotation file to the knowngene.txt file used by RunStandardanalysis.sh. ERANGE reports the number of mapped reads per kilobase of exon per million mapped reads, measuring the transcriptional activity for each gene. To obtain an accurate measure of gene expression not biased by reads mapping to splice junctions in genes with many introns, ERANGE considers both reads mapping to the genome or to the custom splice junctions database. ERANGE was preferred over commercial packages such as CASAVA and the GenomeStudio platform from Illumina because of its open nature. This allowed us to adapt and reuse code for our own analysis with greater flexibility than a comparable closed-source commercial package.

Differential Gene Expression Statistics for RNA-Seq
ERANGE software computes the normalized gene locus expression level (named RPKM) by assigning reads to their site of origin and counting them. In the case of reads that match equally well to several sites, ERANGE assigns them proportionally to their most likely site(s) of origin (Mortazavi et al., 2008). The number of reads falling in a given gene locus can be estimated from the RPKM value as follows: n = RPKM 3 L 3 N Tot 3 10 29 , where n = number of mapping reads at a given gene locus, L = estimated length (bp) of the gene locus, N Tot = number of total mapping reads, and RPKM = gene locus RPKM value. The null hypothesis of no differential gene expression for each gene was tested using the R package qvalue (Storey, 2002;Storey and Tibshirani, 2003;Dabney et al., 2010) on the R working environment. FDRs were calculated based on P values obtained running Student's t test on the raw read counts using the basic R package stats.
The threshold value for the FDR was 0.001, and genes were first selected using this filter. Differentially expressed genes were then filtered again based on a fold change greater than 2.
All Illumina sequence data have been deposited in the National Center for Biotechnology Information short-read archive, and Sanger-sequenced PCR products have been deposited in GenBank (accession no. SRA048731). Assembled contigs longer than 200 bp have been deposited at Transcriptome Shotgun Assembly (accession nos. JR366937-JR375780). Coverage data are available at http://ddlab.sci.univr.it/cgi-bin/gbrowse/medicago/.

Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. Photographs of the harvested plants at 4 d post inoculation.
Supplemental Figure S2. Percentage of TUs detected in both replicates at different RPKM thresholds.
Supplemental Figure S3. Number of genes detected with five or more uniquely mapped reads as a function of the sequencing depth.
Supplemental Figure S4. Comparison of cPTIO and two different NO synthesis inhibitors treatments on the transcript level on six genes.
Supplemental Figure S6. Overview of biotic stress pathway regulated during the early organogenesis and by the cPTIO treatment using MapMan.
Supplemental Figure S7. Overview of general metabolism regulated by the cPTIO treatment during the early organogenesis using MapMan.
Supplemental Table S1. Number of reads obtained for each sample in the experiment.
Supplemental Table S2. List of new transcribed regions identified.
Supplemental Table S3. TUs newly identified mapped on the Mt3.0 genome version (named FAR).
Supplemental Table S4. Cuff tracking data of all transcripts from Cufflinks analysis.
Supplemental Table S8. TUs expressed in the different condition above 2.77 RPKM.
Supplemental Table S9. Validation of RNA-Seq expression results by qRT-PCR.
Supplemental Table S10. TUs differentially expressed in inoculated roots compared with noninoculated roots.
Supplemental Table S11. TUs differentially expressed in nodules compared with noninoculated roots.
Supplemental Table S12. TUs differentially expressed in cPTIO-treated inoculated roots compared with inoculated roots.
Supplemental Table S13. TUs differentially expressed either by inoculation or cPTIO treatment.
Supplemental Table S14. Differential expression analysis of defense genes families affected by cPTIO treatment.
Supplemental Table S15. Primer sequences used for real-time qPCR analysis.