Regular Motifs in Xylan Modulate Molecular Flexibility and Interactions with Cellulose Surfaces [CC-BY]

The presence of regular glycosidic motifs in xylan modulates its molecular flexibility and the affinity toward hydrophobic and hydrophilic cellulose microfibril surfaces in secondary plant cell walls. Xylan is tightly associated with cellulose and lignin in secondary plant cell walls, contributing to their rigidity and structural integrity in vascular plants. However, the molecular features and the nanoscale forces that control the interactions among cellulose microfibrils, hemicelluloses, and lignin are still not well understood. Here, we combine comprehensive mass spectrometric glycan sequencing and molecular dynamics simulations to elucidate the substitution pattern in softwood xylans and to investigate the effect of distinct intramolecular motifs on xylan conformation and on the interaction with cellulose surfaces in Norway spruce (Picea abies). We confirm the presence of motifs with evenly spaced glycosyl decorations on the xylan backbone, together with minor motifs with consecutive glucuronation. These domains are differently enriched in xylan fractions extracted by alkali and subcritical water, which indicates their preferential positioning in the secondary plant cell wall ultrastructure. The flexibility of the 3-fold screw conformation of xylan in solution is enhanced by the presence of arabinofuranosyl decorations. Additionally, molecular dynamics simulations suggest that the glycosyl substitutions in xylan are not only sterically tolerated by the cellulose surfaces but also increase the affinity for cellulose and favor the stabilization of the 2-fold screw conformation. This effect is more significant for the hydrophobic surface compared with the hydrophilic ones, which demonstrates the importance of nonpolar driving forces on the structural integrity of secondary plant cell walls.
These novel molecular insights contribute to an improved understanding of the supramolecular architecture of plant secondary cell walls and have fundamental implications for overcoming lignocellulose recalcitrance and for the design of advanced wood-based materials.

Secondary plant cell walls are formed after primary cell wall deposition once the cells have ceased to expand, resulting in woody tissues with remarkable material properties. While primary cell walls have a plastic, dynamic, and highly hydrated architecture enabling cell growth, secondary cell walls consist of a less hydrated, dense, and thick network of biomacromolecules with fully differentiated structure in multiple lamellae, providing mechanical strength and rigidity to vascular plants. Secondary cell walls are composed of oriented cellulose microfibrils that are embedded in a tightly connected matrix of hemicelluloses and polyphenolic lignins, with well-defined ordering from the nanoscale to the macroscale (Cosgrove and Jarvis, 2012; Burgert and Keplinger, 2013). The cellulose microfibrils in secondary cell walls can associate in bundles that limit the accessibility of the original microfibril surfaces and exhibit increased disorder levels toward the surface, which may indicate a certain microfibril twisting (Fernandes et al., 2011; Cosgrove and Jarvis, 2012). The hemicelluloses are distributed between the cellulose microfibrils and also are connected to lignin, but the nature of these associations is still not well understood. Supramolecular interactions between cellulose microfibrils, hemicelluloses, and lignins in secondary plant cell walls are fundamental for their biomechanical integrity and functionality in different environments (Whitney et al., 1999, 2006; Åkerholm and Salmén, 2001; Ryden et al., 2003; Salmén, 2004; Cosgrove and Jarvis, 2012; Burgert and Keplinger, 2013). However, the molecular features that modulate the interactions among cell wall biopolymers, their 3D macromolecular conformations in planta, and the nature of the nanoscale forces that control cell wall integrity are still controversial (Silveira et al., 2013; Cosgrove, 2014; Mikkelsen et al., 2015).
Spatial and temporal heterogeneity in the composition and molecular structure of plant cell wall polysaccharides may be directed by biosynthetic processes in order to modulate their molecular diversity, adaptability, and biological function (Burton et al., 2010); however, the biological causes and effects of these fine structural variations are unknown. We are starting to understand how the molecular structure of hemicelluloses regulates the association with cellulose microfibrils (Bromley et al., 2013; Busse-Wicher et al., 2014, 2016a, 2016b) and the occurrence of covalent linkages with lignins in lignin-carbohydrate complexes (Lawoko et al., 2005). The presence of intermolecular forces and the interconnected supramolecular architecture in lignocellulosic biomass are responsible for its recalcitrance against conversion into valuable biofuels and platform chemicals (Mortimer et al., 2010; Ding et al., 2012; Silveira et al., 2013).
Xylans represent the main hemicellulose component in the secondary plant cell walls of flowering plants (hardwoods), and together with galactoglucomannans, they are present as well in the secondary cell walls of conifers (also referred to as softwoods). In conifers, a close association between glucomannan and cellulose has been reported, whereas for xylan that association is less clear (Åkerholm and Salmén, 2001; Salmén, 2004). The general molecular structure of xylans consists of a backbone of β-(1→4) D-xylopyranose (Xyl) units, which can be decorated by glycosyl substitutions and chemically modified (by acetylation) depending on the plant type, tissue, and developmental stage. In particular, conifer xylans are of the arabinoglucuronoxylan (AGX) type, with α-(1→2) D-GlcA in its 4-O-methylated form (mGlcA) and α-(1→3) L-arabinofuranose (Ara) as the backbone main decorations and with no reported acetylations (Fig. 1; Escalante et al., 2012; Busse-Wicher et al., 2016b; McKee et al., 2016). A regular distribution of glucuronic acids was reported previously in the xylans extracted from softwoods, whereas the pattern in hardwoods seemed to be irregular (Jacobs et al., 2001). Recent studies, however, have shown that glycosidic and acetyl decorations are preferably evenly spaced in specific domains of the xylan backbone (Bromley et al., 2013; Busse-Wicher et al., 2014; Chong et al., 2014), which could sterically allow favorable interactions with the hydrophilic surfaces of cellulose microfibrils (Busse-Wicher et al., 2016a, 2016b). Moreover, backbone substitution influences the solubility of hemicelluloses and their macroscopic properties (Sternemalm et al., 2008; Pitkänen et al., 2009; Escalante et al., 2012; Bosmans et al., 2014; Littunen et al., 2015).
In this study, we investigate the substitution pattern of softwood xylans and the effect of distinct intramolecular motifs on the conformation in solution and on the interaction with cellulose surfaces, combining comprehensive mass spectrometry (MS)-based glycan sequencing and in silico molecular dynamics (MD) simulations. The presence of regular motifs with both evenly spaced and consecutive glycosyl decorations on the backbone is confirmed for xylan extracted from Norway spruce (Picea abies) secondary cell walls. MD simulations suggest that these substitution motifs in xylan modulate the affinity for cellulose hydrophilic and hydrophobic surfaces and stabilize the conformational transition of xylan from a 3-fold screw in solution to a 2-fold screw on the cellulose microfibrils. This offers new molecular insights into the structural integrity and the supramolecular architecture of plant secondary cell walls, with fundamental implications for overcoming plant biomass recalcitrance and for optimizing the utilization of wood-derived materials.

Enzymatic Profiling Reveals Regular Ara and mGlcA Substitution Patterns in Spruce Xylan
Two different xylan fractions were extracted from spruce wood using an alkaline process (AGX-A) and a hydrothermal process by subcritical water (AGX-H), as presented in Supplemental Figure S1. The carbohydrate composition of the two purified xylans is presented in Supplemental Table S1. AGX-A shows lower relative glucuronation and higher arabinosylation compared with AGX-H. These differences in substitution may arise from the different extractability of distinct xylan populations in softwoods or may be induced by the extraction process. Detailed fingerprinting of the substitution pattern of AGX was performed using xylanolytic enzymes with known substrate recognition toward the presence of decorated glycosyl units (Fig. 1; Supplemental Fig. S2; Pell et al., 2004; Vardakou et al., 2008; Pollet et al., 2010; St John et al., 2011). End-point incubation of AGX with a GH10 β-xylanase releases xylobiose (X₂) and Xyl as main products, together with two main limiting decorated oligosaccharides (P₃Uᵐ; P₃ and P₄). Sequencing of these decorated oligosaccharides by liquid chromatography (LC) coupled with electrospray ionization (ESI)-tandem mass spectrometry (MS/MS) confirms that these structures correspond with UᵐXX and A³X with the substituted sugar unit in the nonreducing end and with XA³X (Supplemental Fig. S3), in agreement with the cleavage mechanism of GH10 β-xylanases (Pell et al., 2004; Pollet et al., 2010). Hydrolysis by a GH11 β-xylanase provides X₂ and X₃ as the main hydrolytic products, together with XUᵐXX and XA³X as the main limiting decorated oligosaccharides (Supplemental Fig. S3; Vardakou et al., 2008; Pollet et al., 2010). Minor amounts of longer oligosaccharides can be observed in the digestion products of GH10 and GH11.
This confirms the combined presence of domains in AGX with loose distribution of substitutions that favor enzymatic saccharification together with clustered substitution regions that prevent the hydrolytic action of the xylanases by steric hindrance, as we observed in our previous study (McKee et al., 2016).
The use of a specific β-glucuronoxylanase (GH30) releases acidic oligosaccharides based on the recognition of the (m)GlcA substitution at the xylosyl residue in position −2, which enables the identification of the GlcA spacing (Fig. 1; St John et al., 2011; Bromley et al., 2013). Interestingly, two main aldouronic acid oligosaccharides (P₆Uᵐ and P₇Uᵐ) can be observed in AGX-A (Fig. 1). Smaller oligosaccharides (P₂Uᵐ to P₅Uᵐ) can be observed as well with minor relative abundance. Further incubation with an additional exo-α-arabinofuranosidase (GH51) removes the potential Ara substitutions and provides singly substituted aldouronic acids with increasing lengths (Fig. 1). Interestingly, the peak corresponding to P₇Uᵐ disappears from the HPAEC-PAD and ESI-MS profiles, whereas the relative abundance of P₆Uᵐ increases. Therefore, these oligosaccharides can be ascribed to a xylohexaose containing single Ara and mGlcA substitutions and a xylohexaose containing one mGlcA substitution, respectively, indicating the existence of a predominant repetitive spacing of mGlcA on every six Xyl units in the backbone. Further sequencing of the oligosaccharide motifs in AGX-A was performed by LC-ESI-MS/MS. Single ion monitoring (SIM) reveals the presence of isobaric oligosaccharides in the different series from P₃Uᵐ to P₇Uᵐ (Fig. 2). Remarkably, few isomers can be observed for each specific mass-to-charge ratio (m/z), indicating the presence of nonrandom structures for each isobaric series. Two main peaks can be observed in the SIM chromatograms for AGX-A at m/z 1,263 and 1,423. ESI-MS/MS fragmentation of the main oligosaccharide after derivatization at m/z 1,263 confirms the placement of the mGlcA substitution in the xylosyl residue at position −2 from the reducing end; this corresponds with the structure XXXXUᵐX (Fig. 2), following the systematic nomenclature proposed for xylooligosaccharides (Fauré et al., 2009), in agreement with the cleavage pattern for GH30 glucuronoxylanases. The identical position of the mGlcA substitution was verified for other acidic oligosaccharides in the series (XUᵐX, XXUᵐX, XXXUᵐX, and XXXXXUᵐX), as presented in Supplemental Figure S4.
(Figure caption, beginning not recovered) ... (Pell et al., 2004; Vardakou et al., 2008; Pollet et al., 2010; St John et al., 2011). The position of the hydrolytic activity is marked with a wedge at the glycosidic linkage between the −1 and +1 subsites. C, Oligosaccharide fingerprinting with high-pH anion-exchange chromatography with pulsed amperometric detection (HPAEC-PAD) for the spruce AGX-A. D, Oligomeric mass profiling with ESI-MS of the AGX-A. Note that P refers to a pentose (Xyl or Ara) and Uᵐ refers to mGlcA.
The SIM chromatogram for m/z 1,423 revealed the presence of two oligosaccharides with Ara substitutions that disappear with the addition of the GH51 arabinofuranosidase. The structure of the main oligosaccharide could be fully sequenced (Fig. 2), and the position of the Ara substituent was located two Xyl units away from the mGlcA substitution point (xylosyl residue at position −4 from the reducing end), corresponding with the structure XXA³XUᵐX. A similar relative placement of the Ara unit two residues away from the mGlcA was determined for the A³XUᵐX fragment sequenced for m/z 1,103 (Supplemental Fig. S4). The minor oligosaccharide motif A³XXXUᵐX also was sequenced (Supplemental Fig. S4), with the Ara substitution located in the nonreducing end, four Xyl units away from the Uᵐ substitution point (xylosyl residue at position −6 from the reducing end). It is significant that the placement of the Ara and mGlcA substitutions in these motifs always occurs in even Xyl positions, confirming a controlled level of regularity in spruce AGX. These two main repeating oligosaccharide motifs (XXXXUᵐX and XXA³XUᵐX) also were found in the digestion profile of the alcohol-insoluble residue of the debarked stem of three other coniferous species, namely Douglas fir (Pseudotsuga menziesii), black pine (Pinus nigra), and juniper (Juniperus communis; Busse-Wicher et al., 2016b). We can hereby confirm that these structures also are found in lignified softwoods from the genus Picea, and we here extend the regular and even placement of Ara and mGlcA substitutions to other minor motifs (A³XXXUᵐX and A³XUᵐX) also present in AGX.
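The position bookkeeping used in this section can be sketched in a few lines of Python. This is only an illustration of the nomenclature as it is applied in the text (one symbol per backbone xylose, with the reducing-end residue counted as position −1); the ASCII spellings `Um` (mGlcA-substituted Xyl) and `A3` (Ara-substituted Xyl) and the function names are assumptions made for this example, not part of the original analysis.

```python
import re

# One token per backbone xylose: "X" (unsubstituted),
# "Um" (Xyl carrying mGlcA), "A3" (Xyl carrying Ara at O-3).
# These ASCII spellings are an assumption of this sketch.
TOKEN = re.compile(r"Um|A3|X")

def substitution_positions(motif):
    """Map each decorated residue to its position from the reducing end.

    The reducing-end xylose (rightmost symbol) is position -1,
    its neighbor is -2, and so on toward the nonreducing end.
    """
    tokens = TOKEN.findall(motif)
    if "".join(tokens) != motif:
        raise ValueError(f"unrecognized symbol in {motif!r}")
    n = len(tokens)
    names = {"Um": "mGlcA", "A3": "Ara"}
    return {-(n - i): names[tok] for i, tok in enumerate(tokens) if tok != "X"}

def all_even(motif):
    """True when every decoration sits on an even-numbered Xyl position."""
    return all(pos % 2 == 0 for pos in substitution_positions(motif))

for motif in ["XXXXUmX", "XXA3XUmX", "A3XUmX", "A3XXXUmX", "XXXUmUmX"]:
    print(motif, substitution_positions(motif), all_even(motif))
```

Run on the motifs named in the text, the four Ara/mGlcA motifs of this section come out with decorations only at even positions, whereas a motif with adjacent mGlcA units such as `XXXUmUmX` does not.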

Differences in Substitution Pattern between Xylan Populations Extracted by Alkali and Subcritical Water
The digestion of AGX-H with GH10 and GH11 releases similar oligosaccharide profiles as for AGX-A, with a marked presence of longer recalcitrant oligosaccharides (Supplemental Fig. S2). The SIM chromatograms for AGX-H after digestion with the GH30 glucuronoxylanase (Fig. 2) verify the occurrence of XXXXUᵐX as the main repetitive motif, with a lower abundance of XXA³XUᵐX. The loss of Ara substitutions in AGX-H could be ascribed to hydrolytic cleavage during subcritical water extraction, since Ara moieties are more labile compared with the resistant mGlcA substituents (Willför et al., 2009). The use of a buffered pH mitigates backbone degradation by autohydrolysis (Ruthes et al., 2017); however, the occurrence of backbone hydrolysis during subcritical water extraction cannot be completely excluded. Higher relative abundance of shorter aldouronic acids (XUᵐX, XXUᵐX, and XXXUᵐX) can be observed, corresponding with intramolecular domains with tighter mGlcA spacing, in agreement with the higher mGlcA-Xyl ratio (Supplemental Table S1). AGX-H exhibits a series of acidic oligosaccharides with multiple mGlcA units (P₄Uᵐ₂, P₅Uᵐ₂, P₆Uᵐ₂, and P₇Uᵐ₂) barely detected in AGX-A (Fig. 2). Remarkably, separation and sequencing of the oligosaccharides corresponding to m/z 1,481 (P₆Uᵐ₂) showed that only one isomer is present and reveals the univocal and adjacent position of the mGlcA substitution in neighboring Xyl units (Fig. 2), corresponding to structure XXXUᵐUᵐX. An identical structure with consecutive placement of mGlcA units was confirmed for the other oligosaccharides in the XₙUᵐUᵐX series (Supplemental Fig. S5). This pattern of glucuronation in xylans, where two mGlcAs are attached consecutively to neighboring Xylp backbone units, has been reported previously for larch (Larix spp.; Shimizu et al., 1978), sugi (Cryptomeria japonica) and hinoki (Chamaecyparis obtusa; Yamasaki et al., 2011), and suggested for spruce (Jacobs et al., 2001).
The presence of major domains with even distribution of the mGlcA units along the backbone and minor domains with uneven glucuronation was already reported in glucuronoxylan from Arabidopsis (Arabidopsis thaliana), which was ascribed to the action of two distinct glucuronyltransferases (Bromley et al., 2013). This structural information reveals that the intramolecular glycosylation pattern in AGX is more complex than reported previously (Busse-Wicher et al., 2016b). We here confirm (1) the relative abundance of regular motifs with even placement of Ara and mGlcA units along the Xyl backbone, with a predominant spacing of mGlcA every six units, (2) the presence of minor motifs with tighter and odd glucuronation spacing, and (3) the existence of domains with adjacent mGlcA substitutions (Fig. 2).
(Figure caption, beginning not recovered) ... Domon and Costello, 1988). C, Classification of the oligosaccharide motifs in spruce xylan based on the substitution pattern.

Influence of Regular Substitution Motifs on the Backbone Conformation in Solution
Four oligosaccharides with a common xylohexaose backbone were considered to study the effect of the substitution pattern on the xylan conformation in solution, in agreement with the sequenced structures in Figure 2B: (1) unsubstituted xylohexaose (X₆); (2) xylohexaose with one mGlcA α-(1→2)-linked to a xylosyl residue in position −2 (XXXXUᵐX); (3) xylohexaose with an mGlcA α-(1→2)-linked to a xylosyl residue in position −2 and an Ara α-(1→3)-linked to a xylosyl residue in position −4 (XXA³XUᵐX); and (4) a xylohexaose with two mGlcA substituents α-(1→2)-linked to consecutive xylosyl residues in positions −2 and −3 (XXXUᵐUᵐX). The nomenclature used for the specific glycosidic linkages in the xylo-oligosaccharides (XOs) is presented in Figure 3. Further details about the molecular dynamics simulations of the sequenced spruce XOs are presented in Supplemental Text S1 and Supplemental Figures S6 to S8. The conformation of the xylan backbone can be modeled by the two dihedral angles (φ and ψ) related to the rotations around the two covalent C-O bonds that form each glycosidic linkage (Fig. 3). Previous modeling studies show that the preferred conformation for xylan in solution is a twisted 3₁-fold helical screw conformation (Mazeau et al., 2005; Busse-Wicher et al., 2014), the same as was reported previously for the experimentally determined crystal structure of xylan hydrate (Nieduszynski and Marchessault, 1971). Free energy surfaces for the backbone glycosidic linkage conformations are presented in Figure 4 (for X₆ and XXA³XUᵐX) and Supplemental Figure S9 (for all other glycosidic linkages). They are divided into four different regions, where region 4 is by far the most populated one. Region 4 is further divided into two parts, 4(−) and 4(+). The sum φ + ψ is indicative of the saccharide chain conformation (French and Johnson, 2009), where a value of 420° represents a left-handed 3₁-fold helical screw and 360° indicates a flat 2₁-fold conformation.
Here, 4(+) corresponds to a mean value of φ + ψ of approximately 440°, and 4(−) gives φ + ψ ≈ 390°. Since these subregions are nearly equally populated, the average value of φ + ψ becomes approximately 415°, meaning that the conformation can be characterized on average as a flexible 3₁-fold helical screw. This is in agreement with our previous observations for X₂ and X₄, where the same simulation methodology was used as in this study (Berglund et al., 2016).
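The arithmetic behind the averaged dihedral sum can be written out as a short Python sketch. It takes the two subregion means quoted in the text (about 440° and 390°) and, as a simplifying assumption, gives them exactly equal weights, whereas the text only states that they are nearly equally populated. The function names and the nearest-reference classification are illustrative choices, not the analysis used in the study.

```python
# Reference values of phi + psi quoted in the text (French and Johnson, 2009).
SCREW_REFERENCES = {
    "2_1-fold": 360.0,
    "left-handed 3_1-fold": 420.0,
}

def mean_dihedral_sum(subregions):
    """Population-weighted mean of phi + psi.

    `subregions` is a list of (population, mean_sum_in_degrees) pairs.
    """
    total = sum(weight for weight, _ in subregions)
    return sum(weight * value for weight, value in subregions) / total

def nearest_screw(sum_deg):
    """Name of the reference conformation whose phi + psi is closest."""
    return min(SCREW_REFERENCES, key=lambda name: abs(SCREW_REFERENCES[name] - sum_deg))

# Equal weights are an assumption standing in for "nearly equally populated".
average = mean_dihedral_sum([(0.5, 440.0), (0.5, 390.0)])
print(average, nearest_screw(average))
```

With equal weights the mean is 415°, which lies 5° from the 3₁-fold reference and 55° from the 2₁-fold one, consistent with describing the solution conformation as a flexible 3₁-fold screw.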
The introduction of a single α-(1→2)-linked mGlcA, as in XXXXUᵐX and XXA³XUᵐX, did not have a significant impact on the conformational space of the glycosidic linkages in aqueous solution. However, even if the main conformation of XXXUᵐUᵐX still is the same, the probability of finding GL4 in region 1 increased 4-fold compared with X₆ (Supplemental Fig. S9), which causes a backbone twist resulting in the two mGlcA substituents pointing out in the same direction (Supplemental Fig. S10). Nevertheless, this effect is not present for the control simulations on the deprotonated oligosaccharides XXXUᵐ(−)Uᵐ(−)X (Supplemental Fig. S11). This is not surprising, since the side groups then would repel each other, and the conformational behavior seems to be pH dependent. Otherwise, deprotonation showed a small effect on the solution structures of the acidic oligosaccharides (Supplemental Fig. S11). Deprotonation of acidic groups may have significance for pH and ionic strength regulation in plant cell walls (White et al., 2014). Furthermore, the presence of an α-(1→3)-linked Ara disturbed the conformational space of GL2 in XXA³XUᵐX, which became more prone toward twisted conformations [4(+) and 3(+)], with a decrease in the probability for 4(−) (Supplemental Fig. S10). This change in backbone flexibility may be caused by steric hindrance from the Ara decoration, which removes the possibility for hydrogen bonding over the glycosidic linkage (O5′···H-O3; Supplemental Table S2). On the other hand, both Ara and mGlcA are highly hydrated. Therefore, side groups can be envisioned as highly hydrated moieties of the xylan macromolecule.

Xylan Binding onto Cellulose Hydrophilic and Hydrophobic Surfaces
Solid-state NMR on never-dried Arabidopsis stems (Simmons et al., 2016) and technical pulp model systems (Larsson et al., 1999; Teleman et al., 2001) indicate that xylan undergoes a conformational change when interacting with cellulose. Based on in silico predictions of NMR chemical shifts by density functional theory and MD simulations, this change has been identified as the transition from a 3₁-fold helical structure to a flat 2₁-fold screw (Busse-Wicher et al., 2016b; Simmons et al., 2016), which intuitively would facilitate close interaction between the xylan polymer and the flat cellulose surface, and even cocrystallization. Therefore, we initially placed the XOs in conformations that corresponded to the cellulose Iβ structure (which constitutes a 2₁-fold screw) in an extra, hypothetical surface layer.
(Figure caption, beginning not recovered) ... (−)]. B, Free energy surfaces (φ, ψ) for the backbone glycosidic linkages in X₆ and XXA³XUᵐX. The probabilities (%) for each region within each free energy surface are presented, and the SE for the probabilities is 0.0 to 1.6. C, Structure and conformation of X₆ and XXA³XUᵐX. For the 2₁-fold conformation, the glycosidic linkages are (280°, 80°), and for the 3₁-fold conformation, they are (290°, 130°). The linkage connecting the Ara side group is (295°, 250°), and that for mGlcA is (80°, 90°), which are common minima for the respective side groups (Supplemental Fig. S15).
The cross section of cellulose microfibrils has been a topic of debate over the last decade. The crystal structure of the cellulose Iβ allomorph (Nishiyama et al., 2002) offers several possible crystallographic planes to be exposed at the surfaces, where the (1-10), (110), and (010) surfaces, on the one hand, and the (200) surface, on the other hand, are generally referred to as hydrophilic and hydrophobic cellulose surfaces, respectively. Depending on which model is assumed for the cross section, different crystallographic planes will be exposed, and in various proportions (see discussion in Cosgrove, 2014). Traditionally, the choice has been to assume a square cross section, primarily exposing the (110) and (1-10) surfaces, with the hydrophobic (200) face only present at two corner chains. This model has the benefit of being essentially hydrophilic, which makes sense in an aqueous environment. Ding and Himmel (2006) have advocated a model that builds on the rectangular model but in which several corner chains are removed, producing a diamond-shaped cross section. This model thus has a larger proportion of hydrophobic surface exposed to the surroundings. Finally, the rectangular model (Fernandes et al., 2011) was recently proposed as the model that best fitted a combination of small-angle neutron scattering and wide-angle x-ray scattering on spruce samples. This model exposes the (010) and (200) planes and, thus, has equal proportions of hydrophilic and hydrophobic characteristics. At present, it is not known which model is closest to the truth, although it is clear that it can have great implications on the interactions between microfibrils and with other constituents of the plant cell wall. In this work, we chose to limit the investigation to one hydrophilic (1-10) surface and the hydrophobic (200) surface. The implications of this choice are elaborated further in "Discussion."
Figure 5 shows the free energy space of φ and ψ from XOs interacting with both hydrophilic and hydrophobic cellulose surfaces. Since a sum of φ + ψ = 360°, which corresponds to φ/ψ values along the diagonal, indicates a rigid 2₁-fold screw (French and Johnson, 2009), it is clear that this conformation is maintained for the three central glycosidic linkages of X₆ throughout the simulations. Interestingly, just as in solution, the presence of an Ara side group impacts the backbone conformation also when interacting with a surface. Common conformations of XXA³XUᵐX systems are shown in Figure 5. Although the 2₁-fold conformation is still common, the area in the free energy space becomes larger and more twisted glycosidic linkages become frequent. The XXA³XUᵐX (both protonated and deprotonated) docked on the (200) surface undergoes dynamic changes during MD simulations, from the Xyl backbone parallel to the cellulose chains to a backbone twist in the Xyl at position −4 (Supplemental Video S1), and eventually it returns to the parallel conformation again. This flexibility could favor the affinity of these XXA³XUᵐX motifs toward cellulose microfibrils that exhibit twisted conformations in the axial dimension (Fernandes et al., 2011). On the other hand, mGlcA substitutions do not significantly affect the conformational space of the xylan backbone, either in their protonated or deprotonated state (Supplemental Figs. S12-S14). The conformational space for mGlcA both in solution and on the hydrophilic surface was similar, with the main minima at φ = 80° and ψ = 90°.
However, when interacting with the hydrophobic surface, the minima became more localized (Supplemental Fig. S15), indicating restricted motion. Interestingly, the mGlcA side group, especially at −3 in XXXUᵐUᵐX interacting with (200) cellulose, shows minima around φ = 100° and ψ = 300°. This conformation of the mGlcA unit at −3 seems to interact better with the hydrophobic surface, as depicted in Figure 5. Additional snapshots from the simulations of the different XOs on the (1-10) and the (200) surfaces are depicted in Supplemental Figures S16 to S18.

Quantification of Molecular Adsorption Interactions between Xylan and Cellulose Surfaces by Pulling Out Experiments
After extensive equilibration of the oligosaccharides adsorbed on cellulose surfaces, the XOs were slowly pulled off the cellulose surface by applying a harmonic potential on the surface normal component of the center of mass distance between the cellulose and the XO, with a reference value that increased with a constant velocity of 0.1 nm ns⁻¹, until the oligosaccharide was completely separated from the surface (see description in "Materials and Methods," and Supplemental Text S1). These simulations were used to calculate the reversible work of adhesion (Gibbs free energy) of the oligosaccharides to cellulose (Table I; Fig. 6). The calculated free energies indicate that side groups generally improve the interaction between XOs and cellulose surfaces, with the structure XXXUᵐUᵐX overall exhibiting the strongest ones. The weakest interactions with the (1-10) surface were those of the unsubstituted X₆ structure, together with XXXXUᵐX having a single substitution, followed closely by XXA³XUᵐX. It is noteworthy that the latter two were positioned with their side groups pointing away from the surface, whereas XXXUᵐUᵐX, which exhibited the strongest interaction, has one side group pointing directly to the cellulose. This is in agreement with a recent study that showed that α-(1→2)-linked substitutions improve the interaction between xylan and cellulose (Pereira et al., 2017). These results also clearly show that the interaction with the hydrophobic (200) surface is stronger compared with the hydrophilic (1-10) one. This was also suggested previously (Pereira et al., 2017), where it was seen that xylan diffused from hydrophilic to hydrophobic surfaces at elevated temperatures. Therefore, a substituted structure is a clear advantage for the interaction with the (200) hydrophobic surface, where the weakest interaction was observed for the unsubstituted structure.
In this case, for the substituted oligomers, no side groups are pointing away from the surface, as all residues lie in the same plane. The effect of deprotonation on the interactions with the different surfaces shows no clear trend on either hydrophobic or hydrophilic surfaces. Hydrogen bonding of the XOs was analyzed with both cellulose and water (Supplemental Fig. S19). As expected, hydrogen bonds rarely occur between the XOs and the hydrophobic surface (≤1), while four to five hydrogen bonds were formed on average with the hydrophilic one.
The free energy is a state variable; thus, the calculated desorption energies depend only on the end states and not on the actual path between them. But the reaction path can still convey useful information about the system. Since the pull force was applied to the center of mass, the XOs could adapt to the pulling by following a relatively low-energy trajectory. This means that parts of the XOs that were weakly bound were more likely to detach first. The simulation trajectories showed that detachment always started at a chain end, followed by a process during which the rest of the XO was slowly peeled off. For the unsubstituted XOs, detachment was initiated randomly at either chain end, but for the substituted ones, detachment was much more likely to start at the end farthest away from the substitution, which, in this case, was the nonreducing end. This supports the notion of the side groups acting as anchors of the xylooligosaccharides to the cellulose surface. A series of snapshots showing the conformation of a disubstituted XO during a pull-off simulation is given in Supplemental Figure S20.
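The pull-off protocol described in this section, a harmonic restraint whose reference distance grows at 0.1 nm ns⁻¹, can be illustrated with a minimal Python sketch of how the work done by such a moving restraint accumulates along one trajectory. This is not the free energy estimator used in the study, which is described in "Materials and Methods" and Supplemental Text S1; the spring constant, the function names, and the synthetic trajectory are hypothetical values chosen only for the example.

```python
def reference_position(t_ns, z0_nm, v_nm_per_ns=0.1):
    """Restraint reference distance at time t, moving at constant velocity."""
    return z0_nm + v_nm_per_ns * t_ns

def pulling_work(times_ns, z_nm, z0_nm, k=1000.0, v_nm_per_ns=0.1):
    """Work done by the moving restraint U = 0.5 * k * (z - z_ref(t))**2.

    The instantaneous power is dU/dt at fixed z, i.e. k * (z_ref - z) * v,
    integrated over time with the trapezoid rule.
    Units: nm, ns, and k in kJ mol^-1 nm^-2 (hypothetical value), so the
    result is in kJ mol^-1.
    """
    power = [
        k * (reference_position(t, z0_nm, v_nm_per_ns) - z) * v_nm_per_ns
        for t, z in zip(times_ns, z_nm)
    ]
    work = 0.0
    for i in range(1, len(times_ns)):
        dt = times_ns[i] - times_ns[i - 1]
        work += 0.5 * (power[i] + power[i - 1]) * dt
    return work

# Synthetic trajectory: the oligosaccharide lags 0.01 nm behind the reference.
times = [0.0, 5.0, 10.0]
z = [reference_position(t, 0.5) - 0.01 for t in times]
print(pulling_work(times, z, 0.5))
```

For this synthetic constant lag the restoring force is k × 0.01 nm throughout, so the accumulated work is simply force × velocity × time. In a real pull the lag varies as segments detach, and the work from a single finite-rate pull is on average no smaller than the reversible work, approaching it as the pulling becomes slower.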

DISCUSSION
In this study, we report the existence of intramolecular motifs with distinct placement of glycosyl substitutions along the backbone in xylan extracted from spruce secondary cell walls. We have identified a major motif with a regular substitution pattern consisting of XXXXUᵐX and XXA³XUᵐX, with one mGlcA unit regularly placed every six Xyl units and an Ara moiety locked −2 units from the mGlcA, respectively. These oligosaccharide motifs also have been found in xylan extracted from the stem of other coniferous species (Busse-Wicher et al., 2016b), and their occurrence here is independently verified in softwoods from the genus Picea. This prevalent even spacing of substitutions in xylan had been proposed as well for acetylation in Arabidopsis glucuronoxylan (Busse-Wicher et al., 2014; Chong et al., 2014). Additionally, we have identified the presence of minor xylan domains with a tighter and uneven placement of mGlcA substitutions and a novel repetitive motif with two mGlcAs placed in adjacent positions. This demonstrates the occurrence of a more complex and controlled molecular regularity in softwood xylans than hitherto reported. Bromley et al. (2013) already proposed the presence of molecular domains with different uronic acid spacing for glucuronoxylan in Arabidopsis, which was attributed to the action of two different glucuronyltransferases. The presence of distinct domains with even and clustered placement of glycosyl decorations may be directed by the 3D spatial arrangement of the glucuronyltransferase and arabinofuranosyltransferase catalytic units with respect to the nascent xylan chain. Indeed, the biosynthesis of AGX is still not well known (Rennie and Scheller, 2014), and future studies on the catalytic machinery and enzyme complexes in the Golgi apparatus should cast further light.
Figure 5. Conformations of XOs located on cellulose surfaces. A, Free energy surfaces of X 6, XXA 3 XU m X, and XXXU m U m X. The energy bar and notation of the glycosidic linkages are the same as in Figures 3 and 4. B, Snapshots of XXA 3 XU m X and XXXU m U m X on (1-10) and (200) surfaces. Additional snapshots from the simulations of XOs on cellulose surfaces are presented in Supplemental Figures S17 to S19.
The hydration properties and the conformation of cell wall polysaccharides in aqueous solutions have been matters of intense discussion in recent times. The different β-(1→4) glycosidic linkage types present in the backbone of hemicelluloses show distinct conformation flexibility in solution prone to a 3₁-fold helix, with the xylan backbone being the most flexible compared with the (gluco)mannan and glucan (cellulose) backbone types (Berglund et al., 2016). Here, the effect of the repetitive glycosidic substituents in stabilizing the solution conformation of the xylan backbone has been evaluated. The presence and the protonation state of α-(1→2)-mGlcA substitutions do not seem to exert a strong influence on the conformation of the neighboring β-(1→4)-Xyl units. On the other hand, the presence of α-(1→3)-Ara increases chain flexibility, which can be attributed to the hindered intramolecular hydrogen bonding possibilities of the xylan backbone caused specifically by the presence of a substituent at the O-3 position. Importantly, the presence of Ara and mGlcA side groups affects the probability of interaction with surrounding water by hydrogen bonding compared with the xylan backbone, where intermolecular hydrogen bonds also play a role (Berglund et al., 2016). Therefore, these glycosyl substitutions enhance the hydration state of the xylan macromolecule locally in these positions and prevent the possibility of intermolecular and intramolecular interactions between xylan backbone segments from the same or different macromolecules. This has a strong influence on the aggregation of xylan in dispersions and on the hydrodynamic properties of xylan-based materials. Indeed, higher Ara and mGlcA contents have been reported to hinder the supramolecular aggregation of xylans in aqueous systems (Pitkänen et al., 2009; Bosmans et al., 2014) and to influence the reactivity and macroscopic properties of xylan-based materials (Escalante et al., 2012; Littunen et al., 2015).
Our MD data show the differences in the xylan conformation depending on the surrounding environment in the plant cell wall. The conformation of XOs in solution is that of a relatively flexible 3₁-fold screw, whereas XOs that are docked onto cellulose surfaces retain a flat, relatively rigid, 2₁-fold screw conformation. This is in agreement with both modeling studies and NMR data showing a distinct change in conformation in xylan when there is cellulose present (Teleman et al., 2001; Simmons et al., 2016). Here, we examined these observations for XOs bound to the hydrophilic (1-10) surface and the hydrophobic (200) one, and our simulations show that the 2₁-fold screw backbone conformation does not differ significantly between these surfaces. Interactions between xylan and cellulose on the hydrophilic surface have been analyzed previously in terms of hydrogen bonding (Zhang et al., 2015; Busse-Wicher et al., 2016b). Interestingly, we find that interactions between XOs and cellulose are stronger on the hydrophobic surface than on the hydrophilic one. As our analysis shows that there is virtually no hydrogen bonding taking place between the xylan and the cellulose on these surfaces, it is clear that the interaction is of a different nature. It is instructive to make an analogy to the structure of cellulose, which consists of stacked sheets with an intricate network of in-plane hydrogen bonds. Although the recalcitrance of this structure often has been ascribed to hydrogen bonds alone, computer simulations show that the in-plane hydrogen bonds are 1 order of magnitude weaker than the interplane interactions that exist between the sheets (Bergenstråhle et al., 2010). This interaction arises as a hydrophobic effect from the relatively large penalty in free energy of hydrating the pseudoflat hydrophobic surfaces of the pyranose rings. For the case of xylan, the flat 2₁-fold screw conformation mimics that of cellulose chains, and a similar effect arises here. On the hydrophilic surfaces, the interaction is a combination of both hydrogen bonds and hydrophobic effects, but on the (200) surface, hydrophobic effects dominate, and the interaction becomes stronger. Although the conformation of the backbone is the same on both surfaces considered here, the hydrophobic surfaces seem to stabilize the conformation of the side groups docked on them compared with the hydrophilic surfaces. This conclusion is in agreement with previous observations reporting the migration of an AGX chain from the hydrophilic surface to the hydrophobic surface of cellulose upon thermal treatment at 160°C using MD simulations (Pereira et al., 2017). The MD docking experiments of different XOs with distinct Ara and mGlcA patterns show that the placement of the substitutions is well tolerated by both hydrophilic (1-10) and hydrophobic (200) surfaces. In line with our findings, MD simulations of xylan adsorption onto a cellulose surface in gas phase showed that side groups did not affect the ability of one xylan molecule to adsorb onto a surface (Mazeau and Charlier, 2012). In addition to this, our free energy calculations showed that side groups, and especially the XXXU m U m X structure, where mGlcA are pointing toward the cellulose surface, had a positive impact on the interaction between xylan and cellulose.
[Figure caption fragment: The diagrams exhibit the pulling mechanical free energies (PMF; in kJ mol⁻¹) for X 6, XXXXU m X, XXA 3 XU m X, and XXXU m U m X (in protonated and deprotonated forms) as a function of the distance (in nm) from the cellulose surfaces.]
These observations challenge a previous study, where it was suggested that the common structure XXA 3 XU m X, with side groups spaced at every other xylan unit, is favored, since the side groups can point out to the same side, which makes xylan fit well onto the cellulose (Busse-Wicher et al., 2016b). Orienting the side groups away from the surface could potentially allow for a closer packing of multiple xylan chains, possibly allowing the formation of a layer of single xylan molecules fully coating the cellulose surface, as is proposed for the hydrophilic (110) surface (Busse-Wicher et al., 2016b; Simmons et al., 2016). On the other hand, it is possible that the gain in interaction energy by having the side groups in contact with the cellulose would compensate for the loss of near-crystalline order and the stabilizing effect of neighboring chains. The effect of consecutive side chains on the subsequent binding of other xylan macromolecules onto cellulose surfaces, and the energetic balance between the anchoring effect of the side groups of a single xylan chain in contact with the cellulose surfaces and the stabilization by the binding of two adjacent xylan chains, should be the subject of further studies. In addition, these observations likely depend on which cellulose crystal model is used. On the hydrophilic (010) and (020) surface models used previously (Busse-Wicher et al., 2016b), xylan adsorbs into distinct grooves present at the surfaces. Here, xylan having the side groups pointing out on the same side does indeed seem favored. But if the hydrophobic (200) surface is considered instead, having side groups pointing out on opposite sides is not an obstacle for close xylan-cellulose interaction. On the contrary, having side groups on alternating sides improves the interaction, as in the case of XXXU m U m X.
An interesting parallel can be drawn with xyloglucan, another highly substituted hemicellulose that exhibits strong interaction with cellulose, where the side group residues are believed to anchor the xyloglucan onto the cellulose (Levy et al., 1991; Hanus and Mazeau, 2006; Cosgrove, 2014; Zhao et al., 2014). The fact that the part of the XOs without side groups detached before substituted residues in the pull-off simulations further strengthens this idea (Supplemental Fig. S20).
In our study, the XOs were oriented parallel to the cellulose chains, as suggested earlier (Busse-Wicher et al., 2016b), although theoretically other directions also are possible. Considering that the antiparallel structure of cellulose II is thermodynamically more stable than parallel cellulose I (Langan et al., 2001; Goldberg et al., 2015), it is not unlikely that hemicelluloses deposited next to the cellulose fibrils orient in an antiparallel fashion. Moreover, other orientations (e.g. perpendicular or random) cannot be completely ruled out. There also could be a difference between the (1-10) surface investigated here and the (110) surface, which closely resembles (1-10) but has a slightly different tilt and spacing of the cellulose chains. Another complicating factor is that the Iβ crystallographic unit cell contains two chains, implying that only every other surface chain is identical. Therefore, the binding free energy also could depend on where on the surface the xylan is deposited. Finally, the role of hydrophobic surfaces in wood microfibrils is a question that will depend on which cross section is assumed. In the rectangular model, the hydrophobic surfaces are abundant and quite extended, similar to the computational model used here. However, considering a diamond-shaped cross section, the hydrophobic surfaces are abundant but much narrower, only two to three chains wide, whereas in a square cross section, they are present only as single chains at the corners. These factors limit, to some extent, the general applicability of our results and should be a subject for future studies.
Our results indicate that the presence of glycosyl substitutions in xylan with controlled and well-defined placement along the backbone is not only sterically tolerated on both hydrophilic and hydrophobic surfaces but also favored in terms of xylan conformation and adsorption onto cellulose surfaces. This observation contrasts with previous studies of in vitro adsorption of xylans from solution onto cellulose surfaces, in which a lower degree of substitution was correlated to a higher level of xylan adsorption (Köhnke et al., 2008, 2011; Bosmans et al., 2014). Additionally, technical xylans with fewer glycosyl substituents formed during chemical pulping and readsorbed onto cellulose microfibrils improve pulp strength (Danielsson and Lindström, 2005). However, the nature of the intermolecular interfaces between xylans adsorbed in vitro onto cellulose is different from that of xylans docked onto the surface of cellulose microfibrils during plant cell wall biosynthesis and assembly. In vitro adsorption processes may involve not only the quantification of xylan directly in contact with the cellulose surfaces but also xylan that accumulates in multiple layers due to crowding effects. In this latter case, the presence of side groups in xylans that specifically are beneficial for cellulose interactions might, at the same time, impose hindrances for additional xylan molecules to adsorb next to it, resulting in an overall decrease in the adsorption quantified in vitro. This self-aggregation behavior of xylan with reduced substitution has been observed in plant cell wall analog models based on bacterial cellulose (Mikkelsen et al., 2015). This indicates that the results of in vitro binding experiments must be taken with caution (Cosgrove, 2014; Wang et al., 2015).
Finally, our integrated results from MS-based sequencing of spruce xylan and MD simulations raise further questions regarding the association of xylan with cellulose microfibrils and lignin in plant secondary cell walls. The major and minor domains with even and clustered backbone glycosylation in xylan are differently enriched in the AGX extracted from an alkaline procedure with previous delignification (AGX-A) and in the AGX populations extracted with subcritical water without delignification (AGX-H), respectively. The relative abundance of the minor domains with tighter and consecutive substitution spacing is small compared with the major domains, but they become enriched in the AGX populations extracted with subcritical water. We here hypothesize that the minor domains represent AGX populations with closer connection with lignin. The function of tighter mGlcA positioning might be to provide local environments with acidic pH in the plant cell wall, which could catalyze the formation of lignin-carbohydrate complexes through covalent cross-links with lignol precursors (β-O-4 bonds and γ-ester bonds) during the complex radical polymerization processes that occur during lignification. We have observed that the presence of XXXU m U m X moieties induces conformational changes that could end with the two mGlcA moieties pointing out in the same direction, which would sterically favor the random formation of such carbohydrate-lignin bonds (Supplemental Fig. S10B). The presence of major and minor xylan domains with distinct glucuronation patterns already has been reported (Bromley et al., 2013). These domains were hypothesized to be present in the same xylan molecules based on the impossibility of separating such domains based on charge and size.
Unfortunately, our study cannot ascertain whether the two extracted AGX-A and AGX-H fractions represent two different populations in planta or whether the differences in the relative amounts of the major and minor domains in our AGX-A and AGX-H populations arise from different original xylan molecules. We hypothesize that the alkaline treatment with previous delignification may alter the minor domains with consecutive glucuronation, which may be in closer contact with lignin. On the other hand, the subcritical water extraction may preserve these minor domains intact, so that they become enriched in the AGX-H fractions. Indeed, subcritical water extraction is able to retain the feruloylation in arabinoxylan extracted from wheat (Triticum aestivum) bran (Ruthes et al., 2017). The presence of the major and minor domains with even and clustered glucuronation in the same or different xylan molecules is fundamental to their ability to cross-link between cellulose microfibrils and/or with lignin. Further studies on intact secondary cell walls by nondestructive techniques such as solid-state NMR may cast some light on the specific placement of xylan domains with respect to cellulose microfibrils and lignin components.

CONCLUSION
Our MS sequencing data on the substitution pattern of softwood xylan, together with the molecular modeling of xylan conformation in solution and interaction with cellulose surfaces, offer fundamental molecular insights into the assembly of secondary plant cell walls. The presence of predominant regular motifs with even placement of glycosyl (mGlcA and Ara) substitutions along the backbone is here confirmed for spruce AGX, together with minor motifs with odd and consecutive glucuronation spacing. The presence of glycosyl substitutions has no strong influence on the flexible 3₁-fold screw of the xylan backbone in solution; however, they may act as local moieties prone to hydration. Xylan oligosaccharides adopt a flat, relatively rigid, 2₁-fold screw conformation when they are docked onto hydrophilic and hydrophobic cellulose surfaces. This conformation is stabilized by the presence of glycosyl substituents and is not hindered by the consecutive placement of mGlcA substitutions, as demonstrated by the simulations. These results highlight the importance of nonpolar driving forces for the structural integrity of secondary plant cell walls, and they indicate that both hydrophobic and hydrophilic cellulose surfaces should be strongly considered in future molecular models of plant cell walls. It is worth mentioning here that our modeling experiments consider a simplified model of conifer secondary cell walls without the interference of other cell wall components, such as O-acetylated galactoglucomannan and lignin. Indeed, previous studies have reported the close association of glucomannan and cellulose fibers in spruce wood (Åkerholm and Salmen, 2001), which may alter the interactions between xylan and cellulose.
The confirmation of the specific placement of xylan domains with distinct substitution motifs with respect to cellulose microfibrils (including the different hydrophilic and hydrophobic surfaces), galactoglucomannan, and lignin components in lignified cell walls by nondestructive techniques remains an exciting challenge.
Until then, these new molecular insights on the presence of intramolecular repetitive motifs in xylans and their potential effects on the association with cellulose microfibrils contribute to an improved understanding of the supramolecular architecture of secondary cell walls, with fundamental implications for the development of processes to overcome lignocellulosic recalcitrance and for the design of advanced wood-based materials and products.

Purification of Softwood Xylan and Compositional Analysis
AGX was extracted and purified from Norway spruce (Picea abies) secondary cell walls using two complementary processes: (1) alkali extraction after delignification (AGX-A) following the procedure reported by Escalante et al. (2012) and (2)

Oligosaccharide Profiling and Sequencing
The oligosaccharide profiles after enzymatic digestion were analyzed by HPAEC-PAD as reported previously (McKee et al., 2016). Linear xylooligosaccharides (X 2 -X 6 ; Megazyme) were used as external standards. Oligomeric mass profiling was performed by ESI-MS using a Q-TOF2 mass spectrometer (Micromass). The hydrolysates were desalted with HyperSep Hypercarb cartridges (Thermo Fisher), dissolved in 50% acetonitrile and 0.1% formic acid, and infused directly into the Q-TOF2 mass spectrometer, operated in positive mode, through a syringe pump at a rate of 5 μL min⁻¹. Capillary and cone voltages were set to 3.3 kV and 80 V, respectively.
Oligosaccharide sequencing was achieved after the separation of labeled oligosaccharides by LC-ESI-MS/MS. Derivatization was performed by reduction in 2% borohydride (30 min) followed by permethylation in dimethyl sulfoxide with CH₃I as described previously (Ciucanu and Kerek, 1984). The organic phase was recovered after partition in CH₂Cl₂:water, dried, and resuspended in 50% acetonitrile. The labeled oligosaccharides were separated through an SB-C18 column (250 × 4.6 mm; Agilent Technologies) in a Capillary LC (Micromass) at a flow rate of 10 μL min⁻¹ and a gradient of increasing acetonitrile content (30%-60%) over 110 min. MS and MS/MS analyses were performed with a quadrupole time-of-flight (Q-TOF) system (Waters) in positive mode at 3.3 kV and 60 V in the capillary and the cone, respectively. Argon was used as the collision gas for MS/MS analysis of selected ions, at a voltage of 35 to 90 V, to analyze the diagnostic fragmentation patterns of the oligosaccharides.

MD Simulations
MD simulations were performed on xylan oligomers both free in solution and docked to cellulose surfaces. The simulations lasted for 50 ns and were run with GROMACS 5.1.2 (Abraham et al., 2015), employing the GLYCAM06 force field for carbohydrates (Kirschner et al., 2008) and the TIP3P water model (Jorgensen et al., 1983). Lennard-Jones interactions used a cutoff of 1.2 nm, and electrostatic forces were calculated using particle mesh Ewald summation (Darden et al., 1993; Essmann et al., 1995) with a 1.2-nm real space cutoff. Bonds were kept at their equilibrium values by P-LINCS (Hess, 2008) for the saccharides and by SETTLE (Miyamoto and Kollman, 1992) for water. In simulations of solution structures, constraints were applied to all covalent bonds, whereas for simulations with cellulose, constraints were used on bonds involving hydrogens only. In addition, harmonic restraints were used on xylan residues to keep them in the ⁴C₁ conformation. No scaling was used for 1-4 interactions, according to the GLYCAM convention.
During the simulations, constant pressure was kept at 1 atm using a Parrinello-Rahman barostat (Parrinello and Rahman, 1981), and temperature was controlled using a Nosé-Hoover thermostat (Nosé, 1984; Hoover, 1985). In simulations of xylan docked to cellulose, the temperature was maintained at 300 K, whereas for simulations in solution, a replica-exchange scheme was applied (Sugita and Okamoto, 1999), with 12 replicas at temperatures ranging from 300 K to 366 K (evenly spaced by 6 K) and with attempts to exchange between neighboring replicas at every 10 steps.
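As a minimal sketch of the temperature ladder described above, the following Python snippet builds the 12 evenly spaced temperatures and evaluates the standard Metropolis acceptance probability for a swap between two neighboring replicas. It assumes the plain temperature replica-exchange criterion based on potential energies only and neglects the pressure-volume term that appears at constant pressure. The function names and the example energies are hypothetical.

```python
import math

KB = 0.0083144626  # molar Boltzmann (gas) constant in kJ mol^-1 K^-1

def temperature_ladder(t_min=300.0, step=6.0, n_replicas=12):
    """Evenly spaced replica temperatures in K."""
    return [t_min + i * step for i in range(n_replicas)]

def exchange_probability(u1, u2, t1, t2):
    """Metropolis acceptance probability for swapping two replicas.

    u1, u2: potential energies (kJ mol^-1) of the configurations
    currently held at temperatures t1 and t2 (K).
    """
    delta = (1.0 / (KB * t1) - 1.0 / (KB * t2)) * (u1 - u2)
    return 1.0 if delta >= 0.0 else math.exp(delta)

temps = temperature_ladder()
# Hypothetical potential energies for the two coldest replicas.
p_swap = exchange_probability(-1000.0, -990.0, temps[0], temps[1])
```

A swap is always accepted when the colder replica currently holds the higher-energy configuration. Otherwise, the acceptance probability decays exponentially with the product of the inverse-temperature gap and the energy gap.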
Free energies for the glycosidic bond dihedral angles (φ and ψ) were calculated by Boltzmann inversion of the probabilities. Grid spacing in the plots is 5°, and contour levels are drawn at every 2 kJ mol⁻¹. Energy is anchored to 0 kJ mol⁻¹, and the highest energy level, for conformations that are never attained, is set to 22 kJ mol⁻¹. Hydrogen bonds were analyzed by the gmx hbond tool in GROMACS with the default geometrical criteria (donor-acceptor distance < 0.35 nm and hydrogen-donor-acceptor angle < 30°). The free energy of adsorption of docked xylan structures was calculated from the exponential average over 25 nonequilibrium pull-off simulations of 20 ns each, using Jarzynski's equality (Jarzynski, 1997). Additional information on the construction of the model systems and the free energy calculations is given in Supplemental Text S1.
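The Boltzmann inversion described above can be sketched as follows. The dihedral angles are binned on a 5° grid, the free energy is taken as −kT ln(P/P_max) so that the most populated bin sits at 0 kJ mol⁻¹, and bins that are never visited are set to 22 kJ mol⁻¹. The function name, the temperature of 300 K, and the synthetic input angles are assumptions made for illustration and do not reproduce the analysis scripts used in this work.

```python
import numpy as np

KB = 0.0083144626  # molar Boltzmann (gas) constant in kJ mol^-1 K^-1

def dihedral_free_energy(phi_deg, psi_deg, temperature=300.0,
                         bin_width=5.0, cap=22.0):
    """Boltzmann inversion of a 2D (phi, psi) histogram, in kJ mol^-1."""
    # Wrap angles into [-180, 180) so that every sample falls on the grid.
    phi = (np.asarray(phi_deg, dtype=float) + 180.0) % 360.0 - 180.0
    psi = (np.asarray(psi_deg, dtype=float) + 180.0) % 360.0 - 180.0
    edges = np.arange(-180.0, 180.0 + bin_width, bin_width)
    counts, _, _ = np.histogram2d(phi, psi, bins=[edges, edges])
    prob = counts / counts.sum()
    # Unvisited bins get the cap; visited bins get -kT ln(P / P_max),
    # which anchors the most populated bin at 0.
    free = np.full(prob.shape, cap)
    visited = prob > 0
    free[visited] = -KB * temperature * np.log(prob[visited] / prob.max())
    return np.minimum(free, cap), edges

# Synthetic, hypothetical dihedral samples (not simulation data).
rng = np.random.default_rng(0)
phi_samples = rng.normal(-75.0, 15.0, size=20000)
psi_samples = rng.normal(120.0, 15.0, size=20000)
free, edges = dihedral_free_energy(phi_samples, psi_samples)
```

The returned array can be passed directly to a contour-plotting routine with levels spaced by 2 kJ mol⁻¹.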

Supplemental Data
The following supplemental materials are available.
Supplemental Figure S1. Scheme of the extraction and purification processes of spruce AGX.
Supplemental Figure S3. LC-ESI-MS/MS fragmentation and sequencing of relevant oligosaccharides released by digestion with GH10 and GH11 b-xylanases.
Supplemental Figure S4. LC-ESI-MS/MS fragmentation and sequencing of the X n U m oligosaccharide series released by digestion with a GH30 b-glucuronoxylanase.
Supplemental Figure S5. LC-ESI-MS/MS fragmentation and sequencing of the oligosaccharide series with consecutive glucuronation released by digestion with a GH30 b-glucuronoxylanase.
Supplemental Figure S6. Charge distribution of the protonated GlcA unit that was used in this work.
Supplemental Figure S8. Glycosidic linkage conformational space in X 6 , comparison between the force fields GLYCAM06 and CHARMM.
Supplemental Figure S9. All free energy surfaces from the different xylooligomers in water solution.
Supplemental Figure S10. Conformations of substituted xylooligomers in solution.
Supplemental Figure S11. Free energy surfaces for XXXXU m(2) X and XXXU m(2) U m(2) X in water solution.
Supplemental Figure S12. Free energy surfaces of backbone glycosidic linkages of xylooligomers on a hydrophilic (1-10) cellulose surface.
Supplemental Figure S13. Free energy surfaces of backbone glycosidic linkages of xylooligomers on a hydrophobic (200) cellulose surface.
Supplemental Figure S15. Free energy surfaces from the linkages connecting the side groups with the Xyl backbone.
Supplemental Figure S16. Snapshots from the simulations of xylooligomers on a (1-10) cellulose surface.
Supplemental Figure S17. Snapshots from the simulations of xylooligomers on a (200) cellulose surface.
Supplemental Figure S19. Average number of hydrogen bonds between xylooligomers and the cellulose surface and xylooligomer and the surrounding water.
Supplemental Figure S20. Snapshots from one of the pulling simulations of XXXU m U m X from the (1-10) surface.
Supplemental Table S1. Composition of the AGX (as wt % of dry weight) by monosaccharide analysis.
Supplemental Table S2. Average number of hydrogen bonds for the X 6 and XXA 3 XU m X xylooligomers in water solution.
Supplemental Video S1. XXA 3 XU m X on (200) showing how the XO moves on the cellulose surface.
Supplemental Text S1. Additional information from the MD simulations.