TY - JOUR
T1 - New multiple imputation methods for genotype-by-environment data that combine singular value decomposition and Jackknife resampling or weighting schemes
AU - Arciniegas-Alarcón, Sergio
AU - García-Peña, Marisol
AU - Canas Rodrigues, Paulo
N1 - Publisher Copyright: © 2020 Elsevier B.V.
PY - 2020/9
Y1 - 2020/9
AB - Missing data is a common phenomenon in agronomy and many other fields of research. Data imputation, in which the missing elements of a data matrix are replaced by plausible values, is one possible way to tackle this problem. In this paper, we consider the case of two-way data tables, e.g. phenotypic traits observed in multi-location plant trials with genotypes in the rows and environments in the columns. We propose two new methodologies for multiple imputation in genotype-by-environment interaction data tables, and in two-way data tables in general, that combine singular value decomposition with either jackknife resampling or weighting strategies. The proposed methods are compared with competing imputation methods from the literature using Monte Carlo simulations and a real data application. Two-way data tables with a given main-effects and interaction structure are simulated, and different percentages of observations are removed to reproduce the three widely used missing data mechanisms: missing at random, missing completely at random, and missing not at random. The imputation methods under consideration are then applied to the incomplete two-way data tables, and comparisons are made via prediction errors and variances between imputations. The best results were obtained by the proposed weighted multiple imputation versions of the eigenvector method, which outperformed the classical method in all the considered scenarios.
KW - Additive main effects with multiplicative interaction model
KW - Jackknife resampling
KW - Missing values
KW - Multiple imputation
KW - Singular value decomposition
UR - http://www.scopus.com/inward/record.url?scp=85088153777&partnerID=8YFLogxK
U2 - 10.1016/j.compag.2020.105617
DO - 10.1016/j.compag.2020.105617
M3 - Article
AN - SCOPUS:85088153777
SN - 0168-1699
VL - 176
JO - Computers and Electronics in Agriculture
JF - Computers and Electronics in Agriculture
M1 - 105617
ER -