Potential Functional Replacement of the Plastidic accD gene by Recent Transfers to the Nucleus in some Angiosperm Lineages

Eukaryotic cells originated when an ancestor of the nucleated cell engulfed bacterial endosymbionts that gradually evolved into the mitochondrion and the chloroplast. Soon after these endosymbiotic events, thousands of ancestral prokaryotic genes were functionally transferred from the endosymbionts to the nucleus. This process of functional gene relocation, now rare in eukaryotes, continues in angiosperms. In this article, we show that the chloroplastic accD gene that is present in the plastome of most angiosperms has been functionally relocated to the nucleus in the Campanulaceae. Surprisingly, the nucleus-encoded accD transcript is considerably smaller than the plastidic version, consisting of little more than the carboxylase domain of the plastidic accD gene fused to a coding region encoding a plastid targeting peptide. We verified experimentally the presence of a chloroplastic transit peptide by showing that the product of the nuclear accD fused to GFP was imported into the chloroplasts. The nuclear gene regulatory elements that enabled the erstwhile plastidic gene to become functional in the nuclear genome were identified and the evolution of the intronic and exonic sequences in the nucleus is described. Relocation and truncation of the accD gene is a remarkable example of the processes underpinning endosymbiotic evolution.


INTRODUCTION
Photosynthetic eukaryotes arose more than a billion years ago through the endosymbiotic association of an alpha proteobacterium (Margulis 1970;Gray et al. 1999) and a cyanobacterium with the progenitor of the nucleated cell (Mereschkowsky 1905;Goksoyr 1967;Deusch et al. 2008). These proteobacterial and cyanobacterial endosymbionts subsequently evolved into mitochondria and chloroplasts, respectively. This transition from endosymbionts to integrated cytoplasmic organelles involved the loss of non-essential or redundant bacterial genes, the creation of protein import machinery and extensive functional relocation of genes from the organelle ancestors to the nuclear genome. Consequently modern cytoplasmic organellar genomes are much smaller in size compared with their prokaryotic ancestors, even though the spectrum of proteins required for function and biogenesis is not substantially different (Timmis et al. 2004). As an example, the human mitochondrial genome encodes only 37 genes and most flowering plant plastomes encode only ~120 genes compared with several thousand genes in the proposed extant relatives of their bacterial ancestors (Timmis et al. 2004).
The merging of two genomes from different lineages through endosymbiosis did more than permit the functional relocation of ancestral organellar genes to the nucleus. It contributed significantly to eukaryote evolution and adaptation to new ecological niches by combining the different biochemical capabilities encoded by each genome and by providing, through endosymbiotic DNA transfer, a continuous and rich source of genetic diversity, new genes, exons, introns and gene regulatory elements (Martin et al. 2002; Martin and Koonin 2006; Noutsos et al. 2007). These transfers of DNA from the organelles to the nucleus continue to occur at a surprisingly high frequency (Thorsness and Fox 1990; Huang et al. 2003; Stegemann et al. 2003; Sheppard et al. 2008). The nuclear copies of extant organelle DNA are referred to as norgs, nuclear integrants of organelle DNA (Leister 2005), and they can be further classified by their cytoplasmic organelle origin as either numts, nuclear integrants of mitochondrial DNA (Lopez et al. 1994), or nupts, nuclear integrants of plastid DNA (Timmis et al. 2004). The fate of nupts is variable, with some being lost within a single generation (Sheppard and Timmis 2009) while others remain in the nucleus for millions of years and evolve neutrally in a nucleus-specific manner (Rousseau-Gueutin et al. 2011a; Rousseau-Gueutin et al. 2012).
Organellar genes are usually non-functional after transfer to the nuclear environment, as they require the acquisition of nuclear gene regulatory elements to become active and a target peptide-encoding sequence if the protein is to be targeted back to the organelle. This process, which has occurred over long evolutionary time periods, has been partially reconstructed experimentally and some of the molecular mechanisms responsible for these rare events have been described (Stegemann and Bock 2006; Lloyd and Timmis 2011; Fuentes et al. 2012). Presumably, the activity of organellar genes functionally transferred to the nucleus will be duplicated for a certain period of time, with functional genes existing in both genetic compartments of the cell, until one becomes defunct by chance mutation (Adams et al. 1999). In the case of the functional transfer of a chloroplastic gene to the nucleus, the retention of the plastidic copy is usually favoured (Rousseau-Gueutin et al. 2012).
However, functional gene transfer can occur repeatedly and eventually the loss of functionality of the plastidic copy results in permanent nuclear residence since no reciprocal exchange of genetic material between the plastome and the nucleus has ever been observed. This one-way mechanism is referred to as a "gene ratchet" (Doolittle 1998).
In animals, the functional relocation of mitochondrial genes to the nucleus appears to have stopped, but it is still occurring in plants, particularly in the angiosperms where the molecular mechanisms of activation and further evolution are uniquely amenable to study. Most of the discovered recent functional gene relocations in angiosperms have involved transfer of mitochondrial genes to the nucleus (Liu et al. 2009), with only a few plastid examples reported (Gantt et al. 1991;Millen et al. 2001;Cusack and Wolfe 2007;Ueda et al. 2007;Magee et al. 2010;Rousseau-Gueutin et al. 2011b).
However, with the recent availability of more than a hundred angiosperm plastome sequences, it has become apparent that several genes have been lost recently in various fully photosynthetically

2008) and Poaceae (Konishi and Sasaki 1994; Martin et al. 1998). In those species, it is expected that an alternative version of accD of eukaryotic or prokaryotic origin will exist in their nuclear genomes to carry out fatty acid biosynthesis in the chloroplast since knock-out experiments of pt-accD in tobacco showed that it is an essential gene (Kode et al. 2005). In addition, several lines of evidence suggest that expression of accD is indispensable during embryo development in Arabidopsis (Bryant et al. 2010).
In the Campanulaceae, which lack accD from their plastome, it was observed that prokaryotic ACCase proteins were nevertheless still present in protein extracts from chloroplasts, as in all flowering plants except the Poaceae family (Konishi et al. 1996). These results indicate that chloroplastic accD must have been functionally transferred from the chloroplast to the nucleus in that family.
Here, we report the identification in Trachelium caeruleum (Campanulaceae) of a chimeric nuclear accD (n-accD) of chloroplast origin that encodes an abridged version of the protein. The entire n-accD transcript encodes only a target peptide fused to the carboxylase domain of the plastidic accD gene. Evidence is provided to show that this nuclear gene has functionally replaced the plastidic gene in T. caeruleum. We also provide substantial insights into the acquisition of functionality of the plastidic gene in the nuclear genome and into its subsequent evolution in this new genetic compartment. Finally, we discuss the genetic changes that may have facilitated its loss from the plastome and its functional relocation to the nucleus in a few plant families during angiosperm evolution.

RESULTS
AccD has been functionally transferred from the chloroplast to the nucleus in T. caeruleum by acquiring nuclear gene regulatory elements
A comparison of approximately a hundred angiosperm plastome sequences showed that accD was defunct and often completely missing in species belonging to the Acoraceae, Campanulaceae, Fabaceae, Geraniaceae (two independent losses) and Poaceae, suggesting at least six independent losses of the plastidic accD gene, consistent with previous reports (Jansen et al. 2007). Since knockout experiments in N. tabacum have shown that it is an essential gene (Kode et al. 2005), it is likely that accD has been functionally replaced by a eukaryotic or prokaryotic-like version in the nuclear genome. In a study of the presence and absence of a prokaryotic type and a eukaryotic type of acetyl-CoA carboxylase in 28 plant families (including Campanulaceae), it was observed that all plant families (with the exception of Poaceae) contained a prokaryotic ACCase in the protein extracts of plastids (Konishi et al. 1996). However, from the comparison of 23 Asterid plastome sequences, it was observed that the plastidic accD gene was missing in the Campanulaceae species Trachelium caeruleum (Haberle et al. 2008). Its absence from the plastome of a Campanulaceae species (T. caeruleum) and its presence in the closely related Asteraceae species (Guizotia abyssinica, Helianthus annuus and Lactuca sativa) suggest a relatively recent loss of this plastid gene following the divergence of these two families. The presence of a prokaryotic type ACCase in the protein extracts of Campanulaceae plastids (Konishi et al. 1996) further indicates that the accD gene must have been functionally transferred from the chloroplast to the nucleus in that family.
Comparison of plastid accD (pt-accD) sequences (Fig. 1A) indicated that the last 250 amino acids encoded by this gene are highly conserved among Asterids. This C-terminal region encodes a carboxylase domain, which is the only known functional domain of the ACCD protein (Zhang et al. 2003). One of the primer pairs designed against part of this conserved region amplified a transcribed sequence using poly-A primed cDNA from T. caeruleum (Fig. 1). The sequence of this 128 bp product does not correspond to any region of the T. caeruleum plastome and shows 83-84% nucleotide identity to a region of the carboxylase domain of Asteraceae pt-accD genes. These sequence data suggest that a nupt encoding part of the accD gene is actively transcribed and polyadenylated in T. caeruleum. The entire sequence of this putative n-accD transcript was obtained by RACE-PCR. The putative n-accD gene encodes a protein of 331 amino acids, compared with the approximately 500 amino acids encoded by the accD gene in the plastomes of Asterids (Fig. 2). The paucity of nuclear sequence data for Asterids precludes unequivocal characterisation of the border between plastid-like and pre-existing nuclear sequence in this transcript. However, we found that 75 residues at the N-terminus of the encoded protein show low similarity (e-value ≈ 10^-6; 40% amino acid sequence similarity) with the middle of the intron-less 3-ketoacyl-acyl carrier protein synthase I (KAS I) gene from Vitis vinifera and that the 235 amino acids encoded at the 3' end of the T. caeruleum n-accD transcript are 69% similar to the accD carboxylase domain encoded by Asteraceae pt-accD genes.
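The nucleotide identity values quoted above are, in essence, the fraction of matching positions in a pairwise alignment. The following is a minimal sketch of that calculation, assuming that only gap-free columns are counted; the function name and the toy sequences are illustrative and are not data from this study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not accD)
print(round(percent_identity("ATGGCTAGCA", "ATGGTTAG-A"), 1))
```

Whether gapped columns are excluded or counted as mismatches changes the result, so the convention should be stated alongside any identity figure.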
To become functional in the nucleus, the chloroplast-derived accD gene must acquire a promoter, a transit peptide-encoding sequence to import the cytoplasmic protein back into the chloroplast and
A chloroplastic target peptide-encoding sequence was predicted at the 5' end of the nuclear protein (Table SI). To verify experimentally the existence of the predicted transit peptide, the sequence encoding the majority of the n-ACCD protein was fused in-frame to a Green Fluorescent Protein gene (GFP) and the subcellular location of the n-ACCD-GFP proteins was determined. A control construct without any accD sequence (P35S:GFP) was used to verify that no vector sequences adjacent to the GFP gene could encode a cryptic target peptide. Stable tobacco lines transformed with the control construct showed GFP fluorescence in the cytoplasm of leaf guard cells, whereas transgenic plants expressing the fusion construct showed clear plastid-localised GFP fluorescence (Fig. 3). These results confirm that the amino acid sequence at the N-terminus of n-ACCD acts as a chloroplastic target peptide.

accD has been functionally transferred to the nucleus in Campanulaceae species
Plastome sequencing studies have revealed that pt-accD is present in three Asteraceae species (Guizotia abyssinica, Helianthus annuus and Lactuca sativa) but absent from the T. caeruleum (Campanulaceae) plastome due to DNA rearrangements (Timme et al. 2007; Haberle et al. 2008; Dempewolf et al. 2010). To estimate more precisely the evolutionary timing of the loss of this pt-accD gene, slot blot hybridization of total cellular DNA from 18 Asterales species was undertaken (Fig. 4).
This method allows inference of gene location because of the high copy number of plastomes compared with the low copy number of nuclear genomes per cell. Thus, a strong signal is obtained if the gene is located in the chloroplast, whereas no (or a weak) signal is obtained if the gene is in the nucleus. The psbA (ubiquitous plastidic gene) and pt-accD probes produced similar levels of hybridization in all the Goodeniaceae and Asteraceae species investigated, whereas little hybridization of the pt-accD probe compared with psbA was detected in any of the Campanulaceae tested.
These data suggest that pt-accD was lost from the plastome of these latter species near the time of divergence of the Asteraceae and Campanulaceae. A very weak but still significant pt-accD hybridization (10% of hybridization signal) was observed in C. alliariifolia. However, its origin was not investigated further because the hybridizing sequence was considered to be a pseudogene.
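The reasoning behind the slot blot comparison can be expressed as a ratio of the accD signal to the psbA control signal for each sample. The sketch below illustrates that logic only; the band intensities are hypothetical, and the 0.5 cutoff is an assumption chosen for illustration rather than a value used in this study.

```python
def relative_signal(accd: float, psba: float) -> float:
    """Return the accD signal as a fraction of the psbA control signal."""
    if psba <= 0:
        raise ValueError("psbA control signal must be positive")
    return accd / psba


def infer_location(ratio: float, cutoff: float = 0.5) -> str:
    """Classify a sample; the 0.5 cutoff is an assumption, not a value from the study."""
    return "plastome" if ratio >= cutoff else "not in plastome"


# Hypothetical band intensities in arbitrary units: (accD, psbA)
samples = {
    "Asteraceae-like sample": (980.0, 1000.0),
    "Campanulaceae-like sample": (100.0, 1000.0),
}
for name, (accd, psba) in samples.items():
    ratio = relative_signal(accd, psba)
    print(f"{name}: ratio={ratio:.2f} -> {infer_location(ratio)}")
```

Normalising to psbA controls for differences in DNA loading between slots, which is why a ubiquitous plastidic gene is used as the reference probe.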
The loss of pt-accD in the Campanulaceae implies the presence of a functional nuclear copy since the gene is essential (Kode et al. 2005; Bryant et al. 2010). Among Asterales species that still contain a functional pt-accD, it is possible that some may also express a functional n-accD if evolutionary relocation is at an intermediate stage. To investigate this further, primers specific for either the plastidic or the nuclear copy of accD were designed for RT-PCR (Fig. 5A). RT-PCR was undertaken on leaf mRNA from the same 18 Asterales species (14 Campanulaceae, three Asteraceae and one Goodeniaceae) that were used in the slot blot (Fig. 4). All species that lacked pt-accD produced nuclear transcripts (Fig. 5B), apart from Campanula alliariifolia. In all the Campanulaceae species, consistent with the slot blot results, no pt-accD transcript was identified, confirming the loss of the plastidic gene in all these species. A nuclear transcript was subsequently amplified from C. alliariifolia using an alternative primer pair and the rearrangements that caused the loss of pt-accD in T. caeruleum (Haberle et al. 2008) were also confirmed in C. alliariifolia.
The n-accD transcripts from ten Campanulaceae species (A. bulleyana, A. lilifolia, C. carpatica, C. punctata, C. thyrsoides, C. trachelium, J. montana, J. perennis, L. erinus and T. caeruleum) were sequenced and all contained an intact ORF. Seven species had an n-accD transcript of identical size, while deletions of three, six (two deletion events) and 60 bp (two deletion events) were present in T. caeruleum, L. erinus and the two Jasione species, respectively. These deletions occurred towards the 5' end of the n-accD transcript, outside the plastid-derived sequence (Fig. 6). The greatest conservation of these n-accD transcript sequences was found in the accD enzymatic domain encoded by the nupt sequence, mirroring conservation of the 3' region of pt-accD that also encodes the carboxylase functional domain.
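An ORF is described here as intact when it begins with a start codon, ends with a stop codon and contains no premature stop in frame. A minimal sketch of such a check follows, assuming the standard nuclear genetic code and a coding sequence already trimmed to run from start to stop; the function names and test sequences are illustrative. The reported deletion totals (3, 6 and 60 bp) are all multiples of three, which is what a frame-preserving deletion requires.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard nuclear genetic code (assumed)


def has_intact_orf(cds: str) -> bool:
    """True if cds starts with ATG, ends with a stop codon and has no internal stop."""
    cds = cds.upper().replace("U", "T")
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the last position truncates the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


def deletion_preserves_frame(deleted_bp: int) -> bool:
    """A deletion keeps the downstream reading frame only if its length is a multiple of 3."""
    return deleted_bp % 3 == 0


# Toy sequences (hypothetical, not accD)
print(has_intact_orf("ATGGCTTAA"))      # start, one codon, stop
print(has_intact_orf("ATGTAAGCTTAA"))   # premature stop codon
print([deletion_preserves_frame(n) for n in (3, 6, 60)])
```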
Pairwise analyses of non-synonymous (Ka) and synonymous (Ks) nucleotide substitution rates were undertaken to determine if n-accD was under positive selection. In the Campanulaceae species analysed, nuclear accD showed a Ka ranging from 0.13 to 0.15 and a Ks ranging from 1.01 to 1.55. These values are in accordance with the observation that the time of functional transfer of accD was close to the formation of the Campanulaceae. Pairwise Ka/Ks ratios suggested that the n-accD gene was not positively selected, as is nearly always observed for genes functionally transferred from an organelle to the nucleus (Liu et al. 2009). Indeed, from the study of more than a hundred organellar genes transferred to the nucleus in various angiosperms (including the multiple transfers of 11 genes in several lineages) it was observed that only 1% of genes in pairwise comparisons showed evidence of positive selection (Liu et al. 2009).
The Asteraceae and Goodeniaceae species investigated in this study that were shown to possess a pt-accD gene by slot blot hybridisation were subsequently shown to produce pt-accD transcripts (Fig. 5B). The only exception was Scaevola albida, for which the absence of a transcript is presumed to be due to sequence divergence at the primer binding sites, since the plastidic gene was shown to be present on the slot blot. In Tagetes, the functionality of the gene was further verified by sequencing pt-accD and by observing the presence of an intact ORF. None of the Asteraceae species used in this study contained both plastidic and nuclear transcripts and thus were not at an intermediate stage of functional relocation of the accD gene to the nucleus. Numerous attempts to amplify a whole or partial n-accD in those species were unsuccessful.

Independent functional relocation of pt-accD in a few angiosperm families
Currently available plastome data reveal the loss of a functional pt-accD in species belonging to five different angiosperm families. In the Campanulaceae family and more specifically in Trachelium caeruleum, our work shows that pt-accD appears to have been functionally transferred to the nucleus and replaced by a nucleus-encoded version of prokaryotic origin. The n-accD ORF identified in T. caeruleum encodes approximately 200 amino acids of plastid origin, whereas pt-accD encodes a 500 amino-acid protein in the closely related Asteraceae species. Recently, ESTs corresponding to another putative n-accD gene of plastid origin have been discovered in Trifolium repens (Fabaceae) (Magee et al. 2010). Interestingly, both the Trachelium and Trifolium n-accD genes encode only the 3' end region of the plastidic gene, which corresponds to the carboxyl transferase domain of the ACCD protein (Fig. 7). This is the only known functional domain present in this protein (Zhang et al. 2003).
Despite this common feature, these two n-accD genes have distinct gene structures and have acquired different nuclear regions to promote the nuclear expression and chloroplast targeting of the gene product, implying the occurrence of two different functional accD transfer events in the Fabaceae and Campanulaceae.

The intronic region in the nuclear accD is highly variable
Introns sometimes play a role in the regulation of gene expression (Sheldon et al. 2002;Schauer et al. 2009). To determine if the 1.4 kb intron present in the T. caeruleum n-accD transcript may have a role in the gene expression, we sequenced the intronic region from three other Campanulaceae species (C. punctata, C. thyrsoides and J. perennis). The intron was present at an identical position in this gene in all four species examined, suggesting that it was relocated early in the Campanulaceae lineage.
However, its size is variable, ranging from 1,357 bp in T. caeruleum to 2,430 bp in C. punctata. None of these intronic sequences is highly conserved among the four species (Fig. S1), suggesting an absence of conserved regulatory elements.
were shown to have lost chloroplastic accD from their plastomes. In this study, we provide evidence that members of the Campanulaceae family have functionally transferred the chloroplastic accD gene to the nucleus prior to its loss from the plastome.

Acetyl
It has been hypothesized that accD is retained in the plastome to allow each plastid to control ACCase activity according to its needs (Bungard 2004) because this enzyme is a limiting factor in fatty acid biosynthesis (Madoka et al. 2002). However, the absence of a plastidic version in T. caeruleum (and other angiosperms) and the identification of a functional accD gene (of prokaryotic origin) encoded in the nucleus of that species are not consistent with a plastid autonomous regulatory role for this subunit. Remarkably, this n-accD gene encodes a protein that is about half the size of the plastidic version and shows similarity to only a third of the entire plastidic protein. The region of n-ACCD protein encoded by nuclear sequences is very short (∼100 amino acids) and consists of little more than a target peptide-encoding sequence.
Several lines of evidence strongly indicate that the minimal n-accD gene of prokaryotic origin present in the Campanulaceae is functionally equivalent to the plastid gene found in most flowering plant species. Firstly, the plastid-derived region of the nuclear transcript encodes the carboxyl transferase domain that is the only known functional domain of the n-ACCD polypeptide. Within the Campanulaceae, the 3'end of n-accD that corresponds to the accD carboxylase domain is highly conserved compared to the 5'end of the gene, consistent with it also encoding a functional domain in this nuclear gene. Secondly, the carboxyl transferase domain of the nuclear protein maintains the "PLIIVCASGGARMQE" motif that is considered to be the accD putative catalytic site (Lee et al. 2004). Thirdly, we demonstrated that this nuclear transcript encodes a transit peptide that targets the ACCD precursor proteins to the chloroplast. Fourthly, we were able to isolate a similar nuclear transcript from all the Campanulaceae species that have lost pt-accD whereas it was missing in related species where a pt-accD gene is retained in the plastome. Finally and despite testing many primer pairs designed in various regions of the plastidic or nuclear accD gene, we have never been able to amplify or sequence any other candidate accD transcripts in Campanulaceae. All these results are in accordance with the functional replacement of the plastidic accD with this minimal nuclear version of accD in the Campanulaceae. Following activation of accD in the nucleus, both nuclear and chloroplastic copies presumably are functional for a period of time. Co-expression of an organellar gene in two different cellular genetic compartments has only been reported for a few mitochondrial genes (cox2, rpl5, sdh4) in land plants (Adams et al. 1999;Sandoval et al. 2004;Choi et al. 2006). In our study, none of the 18 species tested showed co-expression of a nuclear and plastidic accD. 
We were unable to find any vestige of n-accD (using multiple primer pairs) in the Goodeniaceae and Asteraceae species that possess a functional pt-accD, suggesting that the relocation of the gene occurred soon after the functional gene transfer, most likely either prior to or immediately after the emergence of the Campanulaceae lineage.
In conclusion, these results provide an example of the evolutionary processes leading to the functional relocation of a chloroplastic gene to the nucleus. The plastidic accD gene, which has been lost independently in diverse glaucophyte, diatom, protist or land plant species (Martin et al. 1998), has been functionally replaced by an abridged nuclear-encoded gene of prokaryotic origin in the Campanulaceae. This transfer has involved the acquisition of additional nuclear sequence that provides both gene expression and protein targeting back to the plastid. This functional relocation has been accompanied and probably facilitated by extensive rearrangements of the plastome in Campanulaceae species.

Plant material and plant growth conditions
Eighteen species belonging to the Asterales were investigated, consisting of two Adenophora

Isolation of nucleic acids and cDNA synthesis
Genomic DNA was isolated from 100 mg of fresh leaf tissue using the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany). Total RNA was prepared using an RNeasy Plant Mini kit (Qiagen, Hilden, Germany) and genomic DNA was removed using a TURBO DNA-free kit (Ambion, Austin, TX).
Reverse transcription (RT) was then performed using an Advantage RT-for-PCR kit (Clontech, Mountain View, CA) with oligo(dT) primers. All kits were used in accordance with the manufacturers' instructions.

PCR and RT-PCR amplification
Amplifications were performed using KapaTaq polymerase (Kapa Biosystems, Woburn, MA) or Phusion High-Fidelity polymerase (Finnzymes) following the manufacturers' instructions. For PCR reactions using KapaTaq, gDNA or cDNA was denatured at 95°C for 2 min and amplified using 35 cycles of 95°C for 30 s, 50-60°C for 30 s and 72°C for 1 min. For PCR reactions using the Phusion High-Fidelity polymerase, gDNA or cDNA was denatured at 95°C for 2 min and amplified using 35 cycles of 98°C for 10 s, 60°C for 15 s and 72°C for 1 min.
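The two cycling programmes described above can be written down as data, which makes their programmed hold times easy to compare. The sketch below is illustrative only: the class and variable names are assumptions, and the totals exclude ramping between temperatures and any final extension, neither of which is specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: str   # kept as text so that a range such as "50-60" can be recorded
    seconds: int


@dataclass(frozen=True)
class Program:
    initial_denaturation: Step
    cycle: tuple  # the steps repeated in every cycle
    n_cycles: int

    def hold_seconds(self) -> int:
        """Sum of programmed hold times, ignoring ramp times."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial_denaturation.seconds + self.n_cycles * per_cycle


# Values taken from the conditions stated in the text
KAPATAQ = Program(Step("95", 120), (Step("95", 30), Step("50-60", 30), Step("72", 60)), 35)
PHUSION = Program(Step("95", 120), (Step("98", 10), Step("60", 15), Step("72", 60)), 35)

for name, program in (("KapaTaq", KAPATAQ), ("Phusion", PHUSION)):
    print(f"{name}: {program.hold_seconds() / 60:.1f} min of programmed holds")
```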

Primers used for PCR amplifications
All primers used for amplifications (PCR, RT-PCR, RLM-RACE and TAIL-PCR) and sequencing are listed in Table SIII. To amplify part of the nuclear accD transcript in T. caeruleum, RT-PCR of T. caeruleum cDNA was undertaken using primers accD-Aster-F2 and accD-Aster-R1. To determine the presence of an intron in the n-accD gene of the different species, genomic DNA was PCR amplified using primers nuc-accD-intron-F and nuc-accD-intron-R. To detect the presence of n-accD or pt-accD transcripts in each species, RT-PCR was undertaken using primers specific to the nuclear (nuc-accD-F and nuc-accD-stop-R) or plastidic copy (pt-accD-F and pt-accD-R). The rDNA-ITS regions of each species were amplified from gDNA using the primer pairs rDNA18S-F and rDNA28S-R or ITS-F and ITS-R.

DNA sequencing reactions
PCR products were cleaned using the QIAquick PCR purification kit (Qiagen, Hilden, Germany) and, after adenosine addition, cloned into the pGEM-T vector (pGEM-T Vector System 1, Promega, Madison, WI) according to the manufacturers' instructions. Positive and independent clones were purified using the GenElute Plasmid Miniprep kit (Sigma-Aldrich, St. Louis, USA) and sequenced using universal primers or randomly designed primers.

TAIL-PCR
TAIL-PCR was performed as described (Liu and Whittier 1995) using the degenerate primer AD6 (Sessions et al. 2002) and gene-specific primers accD-TAIL-R1, accD-TAIL-R2 and accD-TAIL-R3 to obtain the sequences at the 5' end of the n-accD gene, and the degenerate primer AD6 and gene-specific primers accD-TAIL-F1, accD-TAIL-F2 and accD-TAIL-F3 for the 3' end. T. caeruleum genomic DNA was used as template for the PCR reactions.
The probes were generated by PCR using primers accD963F2 and accD1593R (accD) or cp957F and cp1469R (psbA) from Helianthus annuus genomic DNA as template. Detection and quantification was performed using a Typhoon Trio Imaging system and ImageQuant TL software (GE Healthcare, Buckinghamshire, UK).

Vector construction
Expression cassettes pTc-accD, P35S:GFP and pTc-accD:GFP were created using the pGreen system of binary vectors (Hellens et al. 2000). pGreen0029 contains the nptII gene flanked by the NOS promoter and terminator. The 35S terminator and promoter from pPRVIIIA:neoSTLS2 (Huang et al. 2003) were cloned into pGreen0029 using HindIII/BamHI and NotI/XbaI respectively to generate P35SPT. To create the pTc-accD construct that contains the n-accD transcript under the regulatory control of the 35S promoter and terminator, the n-accD transcript was amplified from T. caeruleum cDNA with nuc-accD-XbaIF and nuc-accD-XbaIR primers (both with 5' XbaI sites). The PCR product was digested with XbaI and cloned into P35SPT. To create the P35S:GFP construct (control), the GFP gene was amplified from the ptGW vector using GFPXbaIF and GFPXbaIR primers (both primers contained an XbaI site at their 5' ends). The PCR product was digested with XbaI and cloned into P35SPT.
To generate the pTc-accD:GFP construct that contains the target peptide-encoding sequence (TP) of the T. caeruleum n-accD gene in frame with the GFP gene and between the 35S promoter and terminator, the GFP gene was amplified from the ptGW vector using GFP-AfeIF and GFP-SpeIR primers (primers contain AfeI and SpeI sites at their 5' ends respectively). The PCR product was digested with AfeI and SpeI and cloned into pTc-accD. In this construct, the first 668 nucleotides of the n-accD transcript (containing the TP) were fused in frame with the GFP ORF. All these amplifications were performed using the Phusion High-Fidelity polymerase (Finnzymes) and each expression cassette was sequenced prior to use.

Production of stable transgenic tobacco lines and expression analysis of fluorescent proteins by microscopy
An Agrobacterium strain containing pSoup (Hellens et al. 2000) was transformed with P35S:GFP or pTc-accD:GFP using the 'freeze-thaw' method (An et al. 1988). These strains were used to transform N. tabacum (Wisconsin) using a standard leaf disc method and kanamycin selection (300 mg/l) (Mathis and Hinchee 1994). Leaf material from explants showing high GFP expression was cut into small sections, submerged in 10 μg/mL propidium iodide for 10 min, mounted in water and viewed by confocal laser scanning microscopy using a TCS NT/SP microscope (Leica).

Pairwise analyses of nucleotide substitution rates
Non-synonymous (Ka) and synonymous (Ks) nucleotide substitution rates were calculated using MEGA5 software (Tamura et al. 2011) and the Nei-Gojobori method with the Jukes-Cantor correction. For pairwise analysis, the chloroplastic accD sequence from Ranunculus macranthus (basal eudicot: NC_006796) was used as a reference. A codon-based Z test of selection was then used to determine the type of selection acting on the n-accD gene in Campanulaceae species.
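For orientation, the final step of this kind of analysis can be sketched as follows. In the Nei-Gojobori approach, the proportions of non-synonymous and synonymous differences are each converted to a distance with the Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), and the ratio Ka/Ks is then compared with 1. The sketch assumes that the difference and site counts have already been obtained; the counts shown are hypothetical and the function names are illustrative, not part of MEGA5.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance for an observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ka_ks(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> tuple:
    """Return (Ka, Ks, Ka/Ks) from counts of differences and sites."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0:
        raise ValueError("Ks is zero; the ratio is undefined")
    return ka, ks, ka / ks


# Hypothetical counts for one pairwise comparison
ka, ks, ratio = ka_ks(nonsyn_diffs=60, nonsyn_sites=600, syn_diffs=120, syn_sites=200)
print(f"Ka={ka:.3f} Ks={ks:.3f} Ka/Ks={ratio:.3f}")
```

A ratio well below 1 indicates purifying selection. With the Ka (0.13 to 0.15) and Ks (1.01 to 1.55) ranges reported in Results, any pairing of values gives a ratio between roughly 0.08 and 0.15.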

Molecular phylogenetic analyses
The ITS1 and ITS2 regions from each species were sequenced to verify the correct identification of the accessions. A sequence matrix of 19 species was obtained by multiple alignment using Geneious (Drummond et al. 2010) and by adjusting the resulting alignment manually. The data matrix was analysed using PHYML (Guindon and Gascuel 2003) and the GTR (General Time Reversible) model (Tavaré 1986) or the Neighbor Joining method, with N. tabacum as outgroup. Bootstrap analyses were performed with 10,000 replicates (Felsenstein 1985). Phylogenetic trees were drawn and edited using Archaeopteryx (http://www.phylosoft.org/archaeopteryx). The positions of the species within the phylogenetic tree were congruent with previous phylogenetic studies (Albach et al. 2001; Eddie et al. 2003).

Accession Numbers
The sequence data presented in this paper have been submitted to GenBank with the following accession numbers JQ693016-JQ693033.

SUPPLEMENTAL MATERIAL
A, Positions of the primer pairs that allowed the specific amplification of the plastidic accD (pt-accD: ∼350 bp) or of the nuclear accD (n-accD: ∼970 bp) transcripts. B, RT-PCR results showing the amplification of pt-accD only from Asteraceae species while n-accD was only amplified from Campanulaceae species. Campanula alliariifolia was the only Campanulaceae species for which no n-accD amplification product was obtained, but an n-accD transcript could be amplified in that species using another primer pair. -ve indicates the absence of template. A primer pair specific for the plastidic psbA gene was used as a positive amplification control for all species.
Nucleus-encoded accD predicted protein sequences from six Campanulaceae species (Campanula punctata, Trachelium caeruleum, Campanula carpatica, Adenophora lilifolia, Jasione montana and Lobelia erinus) and plastid-encoded accD protein sequences from two Asteraceae species (Helianthus annuus and Lactuca sativa) are aligned. The predicted plastidic transit peptide present at the amino-terminus of the nucleus-encoded accD protein is boxed with hatched lines. The black boxes indicate the deletion events that occurred in the n-accD ORF of some Campanulaceae species. The location of the intron in the n-accD gene is indicated by a black triangle.
Figure 1. Amplification of a partial nuclear accD (n-accD) transcript that encodes a conserved region of the plastidic accD (pt-accD) gene in Trachelium caeruleum. A, Multiple percent identity plots of pt-accD gene sequences from Asterid species compared with a Helianthus annuus pt-accD reference sequence. The percent identity (50%-100% scale to right of Lactuca sativa plot) of each gap-free aligning sequence is indicated. Two primer pairs (accD-Aster-F1 and accD-Aster-R1; accD-Aster-F2 and accD-Aster-R1) specific for the conserved 3' end of the pt-accD gene were used for amplification of n-accD transcripts from T. caeruleum. B, Ethidium bromide-stained agarose gels showing RT-PCR amplification products from cDNA of T. caeruleum. An n-accD transcript was amplified from T. caeruleum cDNA using a primer pair specific for the conserved 3' end of Asterid pt-accD genes (F2 and R1), but not by a second primer pair (F1 and R1) specific for a larger region of pt-accD sequence. Annealing sites of these primers are indicated in A. H. annuus cDNA was used as a positive control for each primer pair.
Panels A and D show chlorophyll fluorescence, panels B and E show GFP fluorescence while panels C and F are merged images of panels A + B and D + E, respectively. These images demonstrate that the n-ACCD-GFP fusion proteins are targeted to the plastids contained in these tobacco guard cells.
Figure 4. Absence of accD in Campanulaceae plastomes. Total DNA from 14 Campanulaceae, three Asteraceae and one Goodeniaceae species was subjected to slot blot DNA hybridisation using an accD probe and a plastidic psbA probe (control) obtained using Helianthus annuus genomic DNA as template. The phylogenetic relationships of the species used in this analysis, based upon rRNA ITS1 and ITS2 sequences and a neighbor-joining analysis, are shown on the left. The absence of strong hybridization in Campanulaceae suggests that this gene is no longer present in the high copy number plastomes of these species. The only exception is C. alliariifolia (highlighted with an asterisk), where a weak accD hybridisation (10%) was observed. A black circle on the phylogenetic tree indicates the evolutionary time point when the putative pt-accD deletion event occurred. A single sample of triplicate loadings is presented.