- © 2019 American Society of Plant Biologists. All Rights Reserved.

## Abstract

Modern phenotyping techniques yield vast amounts of data that are challenging to manage and analyze. When thoroughly examined, this type of data can reveal genotype-to-phenotype relationships and meaningful connections among individual traits. However, efficient data mining is challenging for experimental biologists with limited training in curating, integrating, and exploring complex datasets. Additionally, data transparency, accessibility, and reproducibility are important considerations for scientific publication. The need for a streamlined, user-friendly pipeline for advanced phenotypic data analysis is pressing. In this article we present an open-source, online platform for multivariate analysis (MVApp), which serves as an interactive pipeline for data curation, in-depth analysis, and customized visualization. MVApp builds on the available R-packages and adds extra functionalities to enhance the interpretability of the results. The modular design of the MVApp allows for flexible analysis of various data structures and includes tools underexplored in phenotypic data analysis, such as clustering and quantile regression. MVApp aims to enhance findable, accessible, interoperable, and reproducible data transparency, streamline data curation and analysis, and increase statistical literacy among the scientific community.

Advances in data acquisition methods have enabled the rapid collection of a vast range of multivariate biological datasets from plants, animals, single-cell systems, and more (Marx, 2013). The development of next-generation high-throughput phenotyping platforms in particular has led to data-rich experimental outputs that capture many physiological processes simultaneously across large populations and under various conditions through time. And yet, the usefulness of these emerging technologies is limited by high operational costs of commercial phenotypic platforms and the ability of researchers to curate, integrate, and explore these complex outputs (Howe et al., 2008). The appropriate interpretation of phenotypic data often requires expertise in programming (R, Python, MATLAB) and statistics or access to expensive statistical software, such as SPSS or JMP. Although great efforts have been made to develop open-source data curation and analysis platforms for exploring RNA sequencing (McCarthy et al., 2017) and metabolomics data (Xia et al., 2015), the platforms for large phenomic datasets are not as advanced. Although the interactive tools used for genome-wide association studies (Seren et al., 2012) include a brief section on analyzing phenotypic data, the scope of such analysis is not comprehensive. While the software typically provided with the commercial phenotyping platform provides an insight on the observed trends during the experiment, the underlying code is not available, making it inflexible and oftentimes not suitable for various experimental designs. Additionally, most off-the-shelf tools lack modules crucial for multivariate analysis, such as principal component analysis (PCA), multidimensional scaling (MDS), or regression models, and none of them provides a standardized pipeline for outlier identification. Yet, standardized methods of data curation, processing, and analysis before publication are increasingly necessary as the scientific community strives for findable, accessible, interoperable and reproducible (FAIR) data (Wilkinson et al., 2016; Reiser et al., 2018). Such optimization of data processing tools and protocols will not only accelerate and standardize data exploration and visualization but will also promote better reproducibility and transparency of data curation and analysis. Tools for high-throughput phenotyping, like HTPmod (Chen et al., 2018), are gradually providing easier access to the tools for data analysis, but integrating the community input, keeping the code open-source, and keeping those tools up-to-date is crucial for sustainable and transparent data analysis.

To contribute toward this goal, we developed an open-source online platform called “MVApp,” an interactive and modular data curation and in-depth multivariate analysis pipeline. MVApp provides a comprehensive toolkit for the thorough exploration of diverse data structures, where the effect of multiple independent variables, including genotype, treatment, and time, can be examined. MVApp aims to help scientists and research staff with limited knowledge of “R” programming and statistics to curate their data in a standardized manner and to perform comprehensive and robust statistical tests within minutes. We developed MVApp in “R” using the “Shiny” framework (Chang et al., 2017), exploiting R’s computational potential by integrating various packages into a graphical user-interface. Examples of previously developed Shiny applications are abundant, including: box plot generators (Spitzer et al., 2014), RNA sequencing data analysis pipelines (Li and Andrade, 2017; Nelson et al., 2017), genomic prediction simulators (Morota, 2017), growth modeling (Chen et al., 2018), and tools for teaching plant breeding (Matias et al., 2018) or business analytics (Nijs, 2018). MVApp not only combines the standard methods for multivariate analysis but also helps with data interpretation. By following the comprehensive README (https://mmjulkowska.github.io/MVApp/), first-time users can decide which modules are of interest for their datasets. Further guidance is provided by in-app suggestions, which advise on decisions between specific methods (e.g. type of curve to fit, number of clusters to use) to apply for their datasets. The MVApp provides the output of the analytical tools and statistical tests along with guidelines for interpretation, e.g. what can be inferred from a statistical test based on the null hypothesis and resulting *P* value. Additionally, the R-commands used to make individual parts of the analysis can be viewed for each section, ensuring the full transparency of the analysis. To make our platform accessible and safe to use, the data uploaded to MVApp (http://MVApp.kaust.edu.sa/MVApp) are not saved on the server; the input is deleted when the session is closed.

In this article, we showcase the possibilities of MVApp using a dataset that includes the phenotypic data of nine Arabidopsis (*Arabidopsis thaliana*) accessions grown under either control or salt-stress conditions over the course of 8 d (Awlia et al., 2016). The dataset (Supplemental Dataset S1) was collected using the high-throughput phenotyping PlantScreen conveyor system (Photon Systems Instruments), which captured seven plant morphology traits that describe the rosette size and architecture, as well as 43 chlorophyll fluorescence traits, reflecting plant photosynthetic activity (Supplemental Dataset S2). In total, the dataset contains 50 phenotypes scored for 160 individual plants across seven time points. This dataset exemplifies data structures containing up to three independent variables. A similar analysis could be employed to examine natural diversity panels consisting of hundreds of genotypes, to compare the phenotypes of mutant and wild-type lines, or to explore phenotype-to-phenotype relationships across different genetic backgrounds. The spatial variation in individual phenotypes can also be inspected, though MVApp does not yet provide a spatial correction feature. The modular design of the MVApp ensures that other modules can be added in the future, dependent on community involvement, contributions, and feedback (https://github.com/mmjulkowska/MVApp/blob/master/CONTRIBUTING.md). We hope that the widespread use of MVApp will increase statistical literacy among early career peers, accelerate scientific discovery, and ensure transparency of data processing.

## RESULTS

### Dynamic Response Characterization

High-throughput phenotyping platforms allow for the nondestructive screening of plants and, therefore, the characterization of their dynamic temporal responses, such as growth or changes in photosynthetic efficiency. These dynamic responses can be summarized by fitting different types of curves (linear, quadratic, exponential, and square root) and polynomial functions to changes in the measured trait values. Here, we use the increase in rosette area as an example (Fig. 1). The simple functions (linear, quadratic, exponential, and square root) are fitted using a linear model through the data points. For quadratic, exponential, and square root functions, the phenotype (e.g. rosette area) is transformed using squaring, natural logarithm, or square root, respectively, before fitting the linear model. The linear fit for each sample can be examined using the fit-plot tab in MVApp (Fig. 1A). Based on the mean regression coefficient (*R*^{2}) of the curve, which is calculated for each individual sample, MVApp suggests the model with the best fit. Polynomial functions can also be fitted to the data points using cubic and smoothed splines. Because polynomial functions have high *R*^{2} values compared to the other models, they are not included in the best model-fit suggestions. In our example, we evaluate the increasing rosette area in the nine Arabidopsis accessions described above. We found that a quadratic function fit the dataset most accurately (*R*^{2} values of 0.9711 and 0.913 for the control and salt-treated plants, respectively), while the exponential model had the second-best fit (*R*^{2} values of 0.9668 and 0.907 for the control and salt-treated plants, respectively; Supplemental Dataset S3).

Curve fitting can also be used as a data curation method in which the user excludes samples whose *R*^{2} is below a chosen threshold (*R*^{2} < 0.7 is the default option). All of the fitted curves can be compared to the original data points and viewed on a fit plot to identify samples that do not follow the expected dynamics observed in the experiment (Fig. 1B). For example, in our dataset, we identified seven samples with *R*^{2} values <0.7 (Supplemental Dataset S4). By examining pictures of those specific samples, we realized that those plants had died during the experiment. Therefore, those samples were removed from the curated dataset for subsequent analyses.

The MVApp offers the option to view summary statistics based on the fitted curves. The user can choose whether the summary statistics are calculated for all the data points or for the data curated based on the *R*^{2}-value threshold. The summary of the growth dynamics can be viewed as box plots, violin plots, dot plots, or bar plots, where the significant differences between user-defined groups (e.g. based on genotype and treatment) are tested using one-way ANOVA and Tukey’s honestly significant difference (HSD) pairwise-comparison tests. For our dataset, we observed that the treatment, the genotype, and their interaction had significant effects on the rosette growth rate (Fig. 1C). “Col-0” and “Te” were the fastest growing accessions under control conditions, while the highest growth rate was observed in “Rsch” and “C24” under the salt-stress conditions. “Cvi” was the slowest growing accession under both the control and salt-stress conditions. Including the seven samples with *R*^{2} values below the threshold level of 0.7 did not significantly affect the observed trends (Supplemental Fig. S1).

### Outlier Identification

Data curation before any analysis is crucial but is oftentimes a lengthy, inconsistent, and manual process with a distinct paucity of standardized guidelines. To ensure consistency between individual experiments performed across different labs, methods used for data curation require standardization. MVApp accelerates and standardizes the curation of data by highlighting possible outliers based on user-selected grouping variables, such as treatment, genotype, and time point. These outliers are identified using various detection methods (Chaloner and Brant, 1988; Diaz-Garcia and Gonzlez-Faras, 2004; Leys et al., 2013), which include the interquartile range (1.5 × IQR), the Cook’s distance, the Bonferroni outlier test, or the sd from the median, based on one or multiple measured traits. If the input data contain a time element or a gradient series suitable for curve-fitting, then a low regression coefficient (indicating a poor fit of the trend line) can be used as an additional criterion for outlier selection. In our example, all traits were used for data curation. Samples were identified as outliers when the values for at least 12 measured traits from that specific sample were outside of the 1.5 × IQR. In this manner, 28 outliers were identified and removed from the curated dataset (Supplemental Dataset S5).

The graphical overview in the outlier removal tabs in MVApp allows the user to visualize the data both with and without the outliers to observe the effects of the selected outlier identification method. We use the trait rosette perimeter as an example (Fig. 2). The effects of outlier removal were particularly visible in the “Te” accession, where both the variance among individual time points and the number of data points outside the normal distribution range were visibly reduced. For other traits, such as nonphotochemical quenching (ϕ[NPQ]) or maximal quantum yield in the light-adapted state (*F _{v}*′/

*F*′; Supplemental Fig. S2), the effects of outlier removal were not as pronounced, with only very extreme data points being removed. As the removal of outliers can result in the loss of important information, we strongly encourage the user to inform their decisions concerning outlier removal and data curation by reviewing their experimental notes, raw data, and images.

_{m}### Tests for Significant Differences

MVApp facilitates standard statistical tests, including parametric and nonparametric tests. Initially, measured traits can be evaluated for normal distribution and equal variance and be visually examined using histograms, quantile–quantile plots, and box plots to validate whether the parametric test assumptions have been met. Responsive in-app message boxes provide assistance with the interpretation of the *P* values and selection of parametric or nonparametric tests. In our dataset, the rosette area follows a normal distribution across all accessions according to the Shapiro–Wilk test (Fig. 3A). As the parametric tests assume that the samples being compared have equal variance, MVApp integrates two tests for variance (Bartlett and Levene) and a graphical visualization of the observed variance using box plots for the observed data (*y*), the subtracted median (*y* − med[*y*]), and the absolute deviation from the median (abs[*y* − med(*y*)]). We observed an equal variance among the accessions for all the salt-stressed plants 7 d after salt treatment (*P* values > 0.05), but not for plants grown under control conditions (*P* values of 1.65 × 10^{−6} and 0.026 for the Bartlett and Levene tests, respectively; Fig. 3B). This suggests that the differences among the measured accessions are more pronounced under control conditions than under salt stress. Furthermore, the traits related to maximum quantum yield under the light-adapted state (*F _{v}*′/

*F*′) and ϕ(NPQ) showed a normal distribution and equal variance under both control and salt-stress conditions (Supplemental Fig. S3, A and B), indicating that genetic variation and the environment affect these traits less significantly than the rosette area.

_{m}Basic parametric tests such as ANOVA assume a normal distribution and equal variance. When these assumptions are not met, users should consider transforming the data or using a nonparametric test, such as Kruskal–Wallis. Both parametric and nonparametric tests are included in MVApp, together with Tukey’s test and the Mann–Whitney/Wilcoxon tests for pairwise comparison. As our data on rosette area did not meet the assumptions for a parametric test, we examined the differences among the accessions using the Kruskal–Wallis test. Significant differences were observed between accessions grouping for the final day of measurement under control and salt-stress conditions (Fig. 3C). On the final day of measurements, the “Col-0,” “Te,” and “C24” accessions had developed the largest rosettes under both conditions studied, while “Nd,” “Can,” and “Cvi” had the smallest. “Rsch” and “Co” exhibited the highest and lowest quantum yields (*F _{v}*′/

*F*′), respectively, measured across the nine accessions under both the control and salt-stress conditions (Supplemental Fig. S3 C). The exact opposite trend was observed for ϕ(NPQ; Supplemental Fig. S3 G). This suggests that the traits reflecting chlorophyll fluorescence might be complementary to the traits commonly used to describe rosette size, providing new insight into plant performance under various conditions.

_{m}The parametric tests implemented in MVApp include the one- or two-sample *t* test and one- or two-way ANOVA using Tukey’s HSD pairwise test. The nonparametric tests available in MVApp include the Kolmogorov–Smirnov test for examining the differences between two samples, as well as the Kruskal–Wallis and Wilcoxon tests, which are commonly used to compare multiple samples. All of these methods of analyses are accompanied by graphical visualizations in the form of box plots, violin plots, bar plots, and scatter plots. Additionally, MVApp facilitates analysis of the interaction between two predefined factors, such as treatment and genotype, using two-way ANOVA. Depending on the data structure, the interaction between two user-defined variables, such as genotype and treatment, can be examined in separate data subsets, e.g. across individual time points. In our datasets, we examined the interaction between genotype and treatment in the data subsets for individual time points. In the case of rosette area, we found a significant interaction between genotype and treatment starting from 4 d after the salt treatment and continuing through to the end of the experiment (Fig. 3D). Interestingly, the interaction between genotype and treatment was already apparent in *F _{v}*′/

*F*′ and ϕ(NPQ) just 1 d after treatment (Supplemental Fig. S3, F–H). This indicates that the chlorophyll fluorescence traits are more responsive to genotype-by-environment interactions than plant size.

_{m}### Correlation Analysis

Measuring multiple traits facilitates the study of phenotype-to-genotype relationships (e.g. comparing mutants to wild-type varieties), but also allows a better understanding of the phenotype-to-phenotype interactions. For instance, correlation analyses can be used to identify phenotypic traits that are highly correlated. Changes in correlation caused by different environments or genetic contexts, for example, can also be detected. These subtleties often remain unexplored. Interactive scatterplots can be used to further inspect the correlation between two chosen traits. Various data structures are accommodated by allowing the user to subset the data before the correlation analysis. As such, the MVApp allows the user to examine the strength, variability, and significance of correlations across individual subsets.

We examined the correlations for similarities and differences between the control and salt-stressed plants in our dataset (Fig. 4; Supplemental Dataset S6), which allowed us to identify major clusters of traits. Traits related to *F _{v}*′/

*F*′ and actual quantum yield in the light-adapted state (ϕ[

_{m}*P*]) were strongly correlated with each other, while traits reflecting photochemical quenching and electron transport beyond PSII photochemistry (qP, qL, and ETR) were less correlated with the single time-point measurements of chlorophyll fluorescence traits reflecting the maximum, minimum, and instantaneous fluorescence (

*F*,

_{m}*F*, and

_{o}*F*, respectively). All of these photosynthetic traits were found to be negatively correlated with traits reflecting NPQ parameters (ϕ[NPQ], non-photochemical quenching), but the strongest negative correlations were observed between

_{t}*F*′/

_{v}*F*′ and ϕ(

_{m}*P*). Additionally, various photosynthetic traits measured at different photon irradiances (Lss1–Lss4) showed highly significant correlations. The rosette area was positively correlated with the quantum-yield cluster of traits and was negatively correlated with the NPQ traits. However, the relationship between plant size and the NPQ traits was less pronounced under salt-stress conditions; many of the correlations were not significant (

*P*value > 0.05; Fig. 4).

### Reduction of Dimensionality

When multiple traits are measured from the same individual, a multidimensional phenotypic space that defines each sample can be obtained. However, the measured traits are likely to be highly correlated. Therefore, the dimensionality of the data can be reduced, thus simplifying the data while maintaining important trends and patterns.

The contribution of each measured trait to the overall observed variance can be examined using PCA. PCA allows users to determine the minimum number of dimensions required to adequately summarize the phenotypic variance. Observations of potentially correlated measured traits are transformed into a set of linearly uncorrelated variables, called “principal components” (PCs; Lever et al., 2017). The first PC accounts for the largest observed variance, providing the best possible summary of the variance observed across the experiment. As PCA is sensitive to relative scaling of the observed variables, we included the option to scale the data before performing PCA in MVApp. Additionally, depending on the data structure, the user can perform PCA on the specific data subsets, grouped by treatment or time point, for instance. In our example of a scaled dataset, 10 PCs were needed to describe at least 98% of the variance observed in the plants grown under both the control and salt-stress conditions (Fig. 5A; Supplemental Dataset S7). The highest contribution to the first PC came from traits related to NPQ and quantum yield under the light-adapted state (Fig. 5B; Supplemental Dataset S8), with no differences in contribution observed between the two conditions. When the data were reduced to the first two PCs, differences between individual accessions were found in both the salt-stress and control conditions (Fig. 5C). Although PCA is not uncommon for multivariate datasets, its implementation and interpretation can be challenging. Furthermore, the contributions of individual traits to PCs are seldom described. Those issues are addressed in MVApp, making it intuitive for the user to understand and explore the PCA, providing additional insight into the data.

Another method of reducing the dimensionality of a dataset containing multiple phenotypes is MDS. With MDS, the levels of similarity among the objects in a dataset are visualized based on their distances from each other. MDS can be done for individual samples in the dataset to see the similarity among individual replicates, or for all measured traits to determine their relationships and changes therein under different conditions. As MDS, like PCA, is also sensitive to the scaling of the variables, the scaling is optional. MVApp also allows the user to perform MDS on all the measured phenotypes and then cluster them using k-means clustering. In our dataset, we observed four clusters, which were related to rosette area, quantum yield, NPQ, and single time-point chlorophyll fluorescence parameters (Fig. 6). Interestingly, the traits were grouped into the same clusters and the positions of the traits respective to each other were almost mirror images for both the control and salt-stress conditions. These results suggest that the relationships among the traits do not change in response to salt stress.

### Cluster Analysis

Once the most interesting traits have been identified, they can be used to group samples with similar behavior into clusters. Clustering can be based on any number of measured traits and is used to reveal subgroups of samples within an experiment that exhibit similar response patterns. In MVApp, users can perform either k-means or hierarchical clustering analyses. K-means clustering assigns the individual samples to a user-defined number of centroids. To aid the user, MVApp provides a number of visual and cumulative methods for defining the optimal number of k-mean clusters. The clusters can be visualized using bar plots or scatter plots according to their measured traits to evaluate the contribution of the user-defined trait to the overall variation in the data. Hierarchical clustering, in contrast, does not require the user to predefine the number of clusters. Based on selected traits, the samples are sorted into a dendrogram. This is matched to a heatmap illustrating the value of each trait selected for clustering (Fig. 7A).

Using hierarchical clustering, we grouped the average values of the accessions scored on the final day of measurement based on traits reflecting rosette architecture, quantum yield, and NPQ using Ward’s method (Fig. 7A). From this, we found that accessions with a large rosette area showed high quantum yield and low NPQ, while accessions with a small rosette area showed low quantum yield and high quenching. Using the dendrogram, which represented the relationships between the individual samples, we selected a specific distance at which the samples were separated into three clusters (Fig. 7B; Supplemental Dataset S9). MVApp allows the examination of the differences between clusters using box plots, ANOVA, or Tukey’s test for significance (Fig. 7C) and automatically lists the traits that are significantly different between clusters. Cluster 1 included accessions grown mainly under salt-stress conditions, with a medium-sized rosette area and high NPQ. Cluster 2 contained the accessions with the largest rosette area, measured exclusively under control conditions. In Cluster 3, accessions grown under both salt-stress and control conditions were found, and these accessions had the smallest rosette area but a high quantum yield. No differences among the clusters were observed in the traits reflecting rosette architecture (Supplemental Fig. S4). The high photosynthetic performances of Clusters 2 and 3 were also reflected in ETR (Supplemental Fig. S4). These results confirm the previous observation that plant size is defined by photosynthetic efficiency (Fig. 4), which is represented by both quantum yield and NPQ. Additionally, the hierarchical clustering shown here indicates that a group of accessions can develop small rosettes despite high photosynthetic performance. As no differences in the single time-point measurements of chlorophyll fluorescence were found between Clusters 2 and 3, mechanisms other than photosynthetic efficiency may be limiting the growth of accessions belonging to Cluster 3. This type of cluster analysis for multivariate phenotypic data remains uncommon in the field, despite the intriguing insight it can provide into plant responses. Through its inclusion in MVApp, we aim to encourage the use of this type of analysis.

### Quantile Regression

Advances in phenotyping methods have enabled scientists to simultaneously examine multiple traits that could be contributing to a trait of major interest, such as plant size or yield. However, correlation and PC analyses limit this examination to the general trends observed in an experiment. Furthermore, the ordinary least-squares regression models that are typically used to evaluate trait contributions to yield or survival are based on the trait average values observed across a population. However, the contribution of individual traits to biomass or yield can differ between the smallest and largest plants within an experiment. Therefore, we integrated quantile regression, which identifies those traits that might contribute significantly to the trait of major interest in each quantile, into MVApp. As quantile regression is relatively new, quite uncommon, and challenging to set up—MVApp brings it within the reach of any scientist. The various data structures can be accommodated by selecting multiple grouping variables and performing the quantile regression on the subsets of the data. Additionally MVApp allows for results of quantile regression to be compared with the results obtained from traditional ordinary least-squares regression model.

For our dataset, we grouped the traits per treatment and performed the quantile regression for the subsets of individual time points. The ϕ(*P*) and ETR contributed to the size of the rosette area in plants grown under both the control and salt-stress conditions through time, though the contribution was higher for the plants grown under control conditions (Fig. 8A; Supplemental Dataset S10). Interestingly, neither quantum yield in the dark-adapted state nor NPQ contributed to the rosette area under either condition (Fig. 8A; Supplemental Dataset S10). When ϕ(NPQ) was chosen as the trait of major interest, we observed significant contributions of *F _{v}*′/

*F*′ in most quantiles; this trait’s contribution was mainly observed under the control condition (Fig. 8B; Supplemental Dataset S10). When either ϕ(

_{m}*P*) or ETR was chosen as the trait of major interest, these two traits were contributing only to each other, and no other chlorophyll fluorescence parameters contributed to those parameters, such as ϕ(NPQ) or

*F*′/

_{v}*F*′ (Fig. 8C; Supplemental Dataset S10). By performing quantile regression on different traits of interest, we developed an understanding of how the traits related to each other (Fig. 8D). We observed that ETR and

_{m}*F*′/

_{v}*F*′ were tightly linked to each other in all quantiles, affecting the majority of the measured traits and mutually affecting each other, but were not explained by any of the other measured traits. Although the trait contribution did not differ substantially among the individual quantiles in our example dataset, we think that it will be a useful feature for more complex datasets covering higher ranges of variation.

_{m}### Estimation of Heritability

Many high-throughput phenotyping experiments use data from forward genetics studies. For in-depth analyses in these types of studies, estimating heritability based on a limited number of genotypes is of utmost importance. MVApp allows the estimation of broad-sense heritability from the measured traits, thereby enabling an informed decision on whether a specific phenotypic trait has sufficient genetic variance to be used for forward genetic studies. In MVApp, heritability is calculated as the ratio of the total genetic variance to the total phenotypic variance. Depending on the data structure, heritability can be estimated for individual data subsets, such as treatment and time point. For our datasets, we calculated heritability of individual traits per treatment and time point. The heritability of individual traits varied between 0 and 0.9862 (Supplemental Dataset S11). Traits measured within the first 2 d exhibited low heritability (<0.6), which increased >0.8 after 4 d of measurement for all traits except the single time-point measurements of photosynthetic activity in light- and dark-adapted conditions (*F _{m}*,

*F*,

_{o}*F*in Lss1–Lss4). Traits associated with quantum yield, NPQ, and rosette area showed the highest heritability, while traits associated with rosette architecture and fluorescence in the dark-adapted state (

_{t}*F*,

_{m}*F*, and

_{o}*F*) had the lowest heritability. The estimated heritability was slightly higher in plants grown under the control condition (0.9081) than in plants grown under salt stress (0.875; Supplemental Dataset S11). As the median-estimated heritability for all the measured traits was >0.89, most of the measured traits seemed to have a genetic basis, which could be dissected using forward genetic screens.

_{t}## DISCUSSION

We live in the exciting time of high-throughput phenotyping, which can generate an avalanche of data reflecting the complexity of different biological systems (Fahlgren et al., 2015). However, analysis of these data takes a significant amount of time and effort (Howe et al., 2008), and comparing data outputs requires standardization of data curation and analysis. The MVApp provides a flexible analytical pipeline, which can deal with multiple data structures and accommodates multiple independent variables, such as treatment, genotype, and time points, which can be used to subset or group the data for individual analyses. The existing tools, such as the software HTPmod (Chen et al., 2018), recognize the need for visualization and modeling of the high-throughput phenotyping data but do not offer opportunities for future extensions. MVApp uniquely strives to make a first step toward a future framework for standardizing data curation, analysis processing, and visualization of diverse datasets, with community input in this process being clearly invaluable. Therefore, we encourage our peers to submit their comments and suggestions for new and improved feature of the MVApp as indicated in the contribution guidelines (https://github.com/mmjulkowska/MVApp/blob/master/CONTRIBUTING.md). By including in-app messages helping to interpret the results of various tests, MVApp provides a comprehensive guideline, which is particularly valuable to first-time users and early career researchers, for data curation, interpretation, and analysis. Streamlining and standardizing data analysis protocols will contribute to the FAIR data principles and improve the review process for publications and other scientific outputs. Our interactive tool for data curation and analysis, MVApp, strives to enhance the transparency of data curation and analysis by improving processing times and by reducing the need for expensive software and extensive knowledge of R or statistics. Despite the wide availability of data analysis tools, none of them are designed for time-efficient analyses of the big datasets generated by medium- and high-throughput phenotyping platforms. MVApp exploits the flexibility of R-based statistical analysis and combines it with a graphical user interface. The graphs and tables produced in MVApp can be downloaded in a publication-ready format with default figure legends, which contain information about the analysis performed and the preprocessing steps, including data curation, allowing the graphs to be easily reproduced. We think that applications such as MVApp will not only contribute to the availability of FAIR data, but also will encourage the scientific community to both share their raw data and standardize data curation so that the figures and analyses reported in scientific publications can be reproduced and better understood by the wider audience.

Outlier curation is an integral part of data analysis, but its importance is often overlooked. As none of the existing data-analysis pipelines includes outlier detection, the MVApp is a pioneer for streamlined data curation, providing a significant contribution to FAIR data processing. In the past, outlier curation methods were developed using coarse-to-fine models for the identification of abnormalities in phenotypes related to plant photosynthesis (Xu et al., 2015). The outlier selection methods in MVApp provide opportunities to identify the possible outliers independent of the nature of the phenotypic trait. In our first release of MVApp, we have included four different methods for outlier identification. Provided that the data contain a time or gradient component, curation can also be performed based on fitting a curve to the growth of individual plants. By visualizing the data before and after outlier identification, the user can make an informed decision about which samples to retain for further analysis. However, the user must be careful when removing data points, as the outliers themselves might contain important information (Altman and Krzywinski, 2016). Therefore, the original dataset containing all the samples, including outliers, remains available in the MVApp dropdown menu for subsequent analysis. We are not aware of any other application that integrates outlier curation that is as transparent and simple to use as the one integrated into MVApp.

For the large datasets that are often produced by high-throughput phenotyping platforms, MVApp offers different methods of dimensionality reduction. Dimension reduction by PCA or MDS can simplify the data by summarizing it in a limited number of dimensions. Both PCA and MDS are often used in metabolomic (Zhang et al., 2016), transcriptomic (Yano et al., 2018), and genomic studies (Miłobędzka and Muszyński, 2017) to help identify interesting patterns among samples. However, PCA and MDS are underexplored for use in large phenotypic studies. We propose that these methods can clarify trait contributions to observed variances, phenotype-to-phenotype relationships, and trends that change in response to components of treatment, genotype, or time. Including PCA and MDS in MVApp facilitates the broader use of these methods by the scientific community.

Although most studies still use linear regression models, focusing only on the contribution of traits for an average plant (Sellam and Poovammal, 2016; Sitienei et al., 2017), MVApp integrates quantile regression, which models the contribution of traits across the entire distribution of plants. Quantile regression can be used as a hypothesis-generating tool, identifying novel plant phenotypes with significant contributions to yield or stress tolerance, and the trait contribution can be estimated for individual quantiles. The use of quantile regression in the field of plant phenotyping is a recent concept. Its integration into the MVApp will help to develop a better understanding of the phenotype-to-phenotype interactions and the contribution of individual traits to the trait of major interest, e.g. survival, yield, or metabolite production. The quantile-regression approach can be applied to the field of plant breeding, where understanding the traits contributing to productivity is key to the development of superior plant varieties with increased yields.

MVApp combines several existing statistical R libraries into a pipeline (presented in Fig. 9), guiding the user through the interactive process of data curation, exploration, and analysis. The graphic user interface of the MVApp and the messages aim to provide better understanding and interpretation of the statistical test outputs, as well as empower the users without skills in command-line software to explore the full potential of multivariate analysis. In this article, we have presented the different functionalities of MVApp, which we aim to expand in the future. We encourage the submission of new modules, suggestions, and contributions from the scientific community to new releases of MVApp. Please see https://github.com/mmjulkowska/MVApp/blob/master/CONTRIBUTING.md for more information about how to contribute. Our goal for MVApp is that it will facilitate data analysis and statistical literacy across the scientific community by compiling different methods that are already in use across various disciplines. We hope that MVApp can unlock the potential of those methods and enhance the experience of data curation and exploration for researchers, especially those investigating phenotype-to-genotype and phenotype-to-phenotype relationships in phenotypic datasets.

## MATERIALS AND METHODS

### MVApp Setup

MVApp was written in “R” (R Core Team, 2015) and its interactive user interface was made from the “shiny” library (Chang et al.). The interactive plots were produced from the “plotly” (Sievert et al., 2017), “ggplot2” (Wickham, 2009), and “gplots” (Warnes et al., 2016) libraries. The color scales for plots were enabled using the “RcolorBrewer” (Neuwirth, 2014) and “colorRamps” (Keitt, 2012) libraries. The data table display was based on the “DT” library (Xie et al., 2018). Reshaping of the data tables into various formats was accomplished using “reshape” and “reshape2” (Wickham, 2007) libraries. Users can download all the graphs in the “.pdf” format, and all the tables in the “.csv” format. The data were scaled using the “scale()” function. The values represented in the tables were rounded to four decimal numbers using the “round()” function, but were left unaltered in the table available for download. All functions were part of the “stats” library in “R” unless indicated otherwise.

### Curve Fitting

The individual samples were separated based on sample ID, genotype, and selected independent variables. Depending on the fitted function, the trait data were not transformed (linear) or were transformed using square root, quadratic, or natural logarithm functions (for quadratic, square root, and exponential functions, respectively). After the transformation, the linear model was fitted using the “lm()” function, and the *r*-squared value was calculated using the “summary(lm())$r.squared” function. The cubic splines were calculated using “lm(trait ∼ bs()),” while the smoothed splines were calculated using “smooth.spline().” Spline functions were developed in the “splines” library (R Core Team, 2015). The ANOVA analysis was performed using the “aov() function,” while the Tukey-HSD pairwise comparison test was calculated using the “HSD.test()” function from the “agricolae” library (de Mendiburu, 2017).

### Data Curation

For outlier detection, the data were first grouped based on the selected independent variables (e.g. genotype, treatment, and time). The selected independent variables were fused into one ID (e.g. “genotype_treatment_time”) and used as grouping variables to identify the outliers.

In our example, the “boxplot()$out” function was used for the outliers identified with 1.5 × IQR. For the Bonferroni test, a linear model was fitted with “lm(),” followed by “car::outlierTest()” from the “car” library (Fox and Weisberg, 2011). To identify outliers using the Cook’s distance, the linear model was fitted with “lm(),” followed by the “cooks.distance()” function from the “base” library (R CoreTeam, 2015). The samples having a Cook’s distance larger than four times the mean Cook’s distance were classified as “outliers.” To identify outliers by their ±sd from the median, the median and sd were calculated using the “summaryBy(),” “median(),” and “sd()” functions, respectively, from the “doBy” library (Hojsgaard and Halekoh, 2018) per predefined grouping variables (e.g. genotype, treatment, time). The dataset was then subsetted for predefined grouping variables, and individual values outside of median ±sd range for that specific subset were identified.

### Summary Statistics

The summary statistics were calculated using the “summaryBy()” function from the “doBy” library (Hojsgaard and Halekoh, 2018). The mean, sd, se, min, max, sum, and number of samples were calculated using the “mean(),” “median(),” “sd(),” “std.error(),” “min(),” “max(),” “sum(),” and “length()” functions, respectively, from the “doBy” (Hojsgaard and Halekoh, 2018) and “plotrix” libraries (Lemon, 2006).

### Hypothesis Testing Using Parametric and Nonparametric Tests

The normal distribution of data were tested using the Shapiro–Wilk test using the “shapiro.test()” function and the QQ-plots were produced using the “qqnorm()” and “qqline()” functions. The equal variance was tested using Bartlett and Levene tests by applying the “bartlett.test()” and “leveneTest()” functions from the “car” library (Fox and Weisberg, 2011). The equal variation was represented visually using the “hovPlot.bf()” function from the “HH” library (Heiberger, 2018). One- and two-sample *t* tests were performed using the “t.test()” function, while the Kolmogorov–Smirnov test was performed using “ks.test().” The ANOVA analysis was performed using the “aov()” function, while the Tukey-HSD pairwise comparison test was calculated using the “HSD.test()” function. The nonparametric Kruskal–Wallis test was performed using “kruskal.test(),” while the pairwise Wilcoxon/Mann-Whitney test was executed using “pairwise.wilcox.test().” The two-way ANOVA analysis was performed using the linear model “lm(),” followed by the “anova()” function. The two-way ANOVA interaction plot was produced by “interaction.plot(),” while the residual plot was produced by plotting “lm()$fitted versus lm()$residual.”

### Correlation Analysis

For our data, the correlation plot was made using the “corrplot()” function from the “corrplot” library (Wei and Simko, 2017). The correlation coefficient was calculated by the selected input method (Pearson or Spearman), and the *r*-squared and *P* values were extracted from the “rcorr()” functions. The correlation significance test was performed by “cor.mtest().”

### Reduction of Dimensionality

PCA was performed using the “PCA()” function from the “factoextra” (Kassambara and Mundt, 2017) and “FactoMineR” (Le et al., 2008) libraries. The eigenvectors were extracted by using “PCA$eig,” and the PCA contribution plot was made using the “fviz_pca_var()” function. The contribution of individual traits for each selected PC was calculated using “PCA$var$contrib,” while the plot was made using the “fviz_contrib()” function. MDS was performed using the “dist(),” “cmdscale(),” and “as_tibble() functions,” and the k-means clustering of the MDS results was performed using the “kmeans()$cluster” function. MDS of the measured traits was performed by transposing the dataset using the “t()” function and completing the MDS analysis as described above. Thus, dimensional relationships were examined among the traits rather than among the samples.

### Cluster Analysis

For hierarchical clustering in MVApp, the correlations among the samples in terms of the chosen traits were calculated using the “cor() function.” Then, the “dist() function” was used to determine the distances between the samples. The hierarchical analysis was performed on the transposed dataset using the “t()” function to determine the relationship among the selected traits with functions from the “pvclust” (Suzuki and Shimodaira, 2015) and “NbClust” (Charrad et al., 2014) libraries. The hierarchical clustering was performed using the “hclust()” function. The heatmap was produced using the “heatmap.2()” function from the “gplots” library (Warnes et al., 2016), scaled per row. The dendrogram was produced from the output of the “hclust()” function using the “plot(as.dendrogram())” function. The number of clusters was determined by cutting the dendrogram at a user-identified distance using the “cutree()” function. The significant differences among individual clusters were identified by the ANOVA and Tukey-pairwise comparison tests using the “aov()” and “glht()” functions from the “multcomp” library (Hothorn et al., 2008), followed by the “cld()” function, which allows letters indicating significant differences among the predefined groups to be integrated into a box plot.

For k-means clustering in MVApp, the optimal cluster number was calculated using different methods. The elbow plot was produced using the “fviz_nbclust(method = “ws”)” function from the “factoextra” library (Kassambara and Mundt, 2017). The silhouette plot was made using the “fviz_nbclust(method = “silhouette”)” function. To ensure that the suggested number of clusters was neither too ambitious nor too conservative, the results of 30 other indices were compared (Charrad et al., 2014), and the best number of clusters was suggested according to “majority rule” using the “NbClust(distance = “euclidean”, min.nc = 2, max.nc = 10, method = “kmeans”)” function, followed by “fviz_nbclust().” Based on these results, the user select number of the clusters and the individual samples were subsequently assigned to a k-means cluster using “kmeans().”

### Estimation of Heritability

Broad-sense heritability was calculated as

where *V _{G}* was the genetic variance and

*V*was the total phenotypic variance. The phenotypic variance can be explicitly expressed as

_{P}where *V _{GL}*,

*V*,

_{GY}*V*, and

_{GLY}*V*represent the genotype-by-location, genotype-by-year, genotype-by-location-by-year, and residual variances, respectively. The number of locations, years, and replicates (within location and year) were represented by

_{R}*l*,

*y*, and

*r*, respectively, and these values were input by the user. The variance components were estimated using the “VarCorr” function on the fitted linear mixed model “lmer(Trait ∼ 1 + (1 | Genotype) + (1 | Year) + (1 | Location) + (1 | Genotype:Location) + (1| Genotype:Year) + (1 | Genotype:Year:Location)” from the “lme4” library (Bates et al., 2015). Heritability values were rounded to two digits.

### Quantile Regression

The quantile regression models for the traits of major interest were fitted using the “rq()” function from the “quantreg” library (Koenker, 2017) for different quantile levels. The estimated values of the regression coefficients and the *P* values were extracted from the summary of the fitted model using “summary(rq())$coefficients(, “Value”)” and “summary(rq())$coefficients(, “*Pr*(> |*t*)”),” respectively.

The quantile plots (Agarwal et al., 2018) were produced by plotting the estimated regression coefficients against the quantile level using the “plot()” and “lines()” functions, and the legends were produced using the “legend()” function.

### Supplemental Data

The following supplemental information is available.

**Supplemental Figure S1.**Effect of outliers identified using*R*^{2}on the differences observed between genotypes.**Supplemental Figure S2.**Data curation assisted with visual inspection.**Supplemental Figure S3.**Hypothesis testing including assumptions for ANOVA.**Supplemental Figure S4.**Hierarchical clusters validation.**Supplemental Dataset S1.**A dataset from Awlia et al. (2016) is used to exhibit the functions of MVApp.**Supplemental Dataset S2.**List of all the measured traits used for the data analysis in Supplemental Dataset S1.**Supplemental Dataset S3.**The*R*^{2}values for different models included in MVApp, reflecting their fit to the increase in rosette area under control and salt-stress conditions.**Supplemental Dataset S4.**Outliers identified using the fit of quadratic function to rosette area in MVApp.**Supplemental Dataset S5.**List of samples identified as outliers by the 1.5 × IQR method, based on all traits.**Supplemental Dataset S6.**Correlation between individual traits measured for plants grown under salt-stress and control conditions.**Supplemental Dataset S7.**Eigen values PCA using all measured traits and curated dataset.**Supplemental Dataset S8.**Contributions (%) of the individual measured traits to the PCs under control and salt-stress conditions.**Supplemental Dataset S9.**Clustering of the phenotypes of nine Arabidopsis accessions using hierarchical clustering with the Ward method, and Area,*F*, Lss4, and NPQ Lss4 as the major determinants._{v}F_{m}**Supplemental Dataset S10.**Quantile regression of various traits of major interest, calculated using MVApp.**Supplemental Dataset S11.**Broad-sense heritability estimations calculated for all individual traits by day of experiment and treatment for nine Arabidopsis accessions with an average of nine biological replicates.

## Acknowledgments

Figure 9 was produced by Ivan Gromicho, scientific illustrator at King Abdullah University of Science and Technology (KAUST). We thank Antonio Arena from Research Computing at KAUST for his help with putting MVApp on the server and making it accessible online; KAUST IT Linux Systems Team who provided the infrastructure for the online hosting of MVApp; and Veronica Tremblay, scientific editor at KAUST, for editing the article. Additionally, we thank Dr. Guillaume Lobet (Louvain/Jurlich University), Dr. Sandra Schmöckel and Dr. Boubacar Kountche (KAUST), Prof. Julia Bailey-Serrez (University of California-Riverside), and Dr. Nazgol Emrani (Kiel University), for their helpful comments on the MVApp design and functionality.

## Footnotes

- Received February 25, 2019.
- Accepted April 30, 2019.
- Published May 6, 2019.