A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection

dc.contributor.author Khalilzadeh, Zahra
dc.contributor.author Kashanian, Motahareh
dc.contributor.author Khaki, Saeed
dc.contributor.author Wang, Lizhi
dc.contributor.department Department of Industrial and Manufacturing Systems Engineering
dc.date.accessioned 2023-10-25T20:09:21Z
dc.date.available 2023-10-25T20:09:21Z
dc.date.issued 2023-09-25
dc.description.abstract Accurately predicting crop yield is vital for enhancing agricultural breeding and ensuring crop production remains resilient in diverse climatic conditions. Integrating weather data throughout the crop growing season, especially for various genotypes, is crucial for these predictions and represents a significant stride in comprehending how climate change affects a variety’s adaptability. In the MLCAS2021 Crop Yield Prediction Challenge, the Third International Workshop on Machine Learning for Cyber-Agricultural Systems released a dataset for soybean hybrids consisting of 93,028 training performance records, to be used to predict yield for 10,337 testing performance records. This dataset spanned 159 locations across 28 U.S. states and Canadian provinces over a 13-year period, from 2003 to 2015. It comprised details on 5,838 distinct genotypes and daily weather data for a 214-day growing season, encompassing all possible location and year combinations. As one of the winning teams, we designed two novel convolutional neural network (CNN) architectures. The first proposed model combines CNN and fully-connected (FC) neural networks (CNN-DNN model). The second proposed model adds an LSTM layer at the end of the CNN part for the weather variables (CNN LSTM-DNN model). The Generalized Ensemble Method (GEM) was then utilized to determine the optimal weights of the proposed CNN-based models to achieve higher accuracy than other baseline models. The GEM model we introduced outperformed all other baseline models employed in soybean yield prediction. When evaluated on test data, it achieved an RMSE 5.55% to 39.88% lower, an MAE 5.34% to 43.76% lower, and a correlation coefficient 1.1% to 10.79% higher than the baseline models.
The proposed CNN-DNN model was then employed to identify the best-performing genotypes for various locations and weather conditions by making yield predictions for all potential genotypes in each specific setting. The dataset provides unique genotype information on seeds, allowing investigation of the potential of planting genotypes based on weather variables. The proposed data-driven approach can be valuable for genotype selection in scenarios with limited testing years. We also performed a feature importance analysis based on the change in root mean square error (RMSE) to identify the predictors that most affect our model’s predictions. The location variable exhibited the highest RMSE change, emphasizing its pivotal role in predictions, followed by MG, year, and genotype, showcasing their significance during crop growth stages and across different years. In the weather category, MDNI and AP displayed higher RMSE changes, indicating their importance.
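The ensemble weighting and the RMSE-change feature importance mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the classical closed-form GEM weights (the inverse of the model-error covariance matrix, normalized to sum to one) and a permutation-style RMSE change, neither of which the abstract specifies. All function names, the synthetic data, and the stand-in predictors are hypothetical.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def gem_weights(preds, y_true):
    """Closed-form GEM weights (assumed formulation, not taken from the paper).

    preds has shape (n_models, n_samples). The weights minimize the
    ensemble MSE on these samples subject to summing to one; individual
    weights may be negative when model errors are strongly correlated.
    """
    errors = preds - y_true                    # per-model residuals
    cov = errors @ errors.T / errors.shape[1]  # error covariance matrix
    cov_inv = np.linalg.pinv(cov)              # pseudo-inverse for stability
    raw = cov_inv.sum(axis=1)
    return raw / raw.sum()


def rmse_change_importance(predict, X, y, rng):
    """RMSE increase after shuffling each feature column in turn."""
    base = rmse(y, predict(X))
    changes = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        changes[j] = rmse(y, predict(X_perm)) - base
    return changes


# Synthetic stand-in data: three features, the third is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
signal = 3.0 * X[:, 0] + 0.5 * X[:, 1]
y = signal + rng.normal(scale=0.1, size=500)

# Two hypothetical base models with independent prediction noise.
preds = np.stack([
    signal + rng.normal(scale=0.4, size=500),
    signal + rng.normal(scale=0.6, size=500),
])

weights = gem_weights(preds, y)
individual_rmses = [rmse(y, p) for p in preds]
ensemble_rmse = rmse(y, weights @ preds)


def predict(features):
    """Stand-in predictor that uses only the first two features."""
    return 3.0 * features[:, 0] + 0.5 * features[:, 1]


importance = rmse_change_importance(predict, X, y, rng)

print("GEM weights:", weights)
print("Individual RMSEs:", individual_rmses, "ensemble RMSE:", ensemble_rmse)
print("RMSE change per feature:", importance)
```

In practice the weights would be estimated on held-out validation predictions rather than on the data used to report accuracy. A larger RMSE change marks a predictor the model relies on more heavily.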
dc.description.comments This preprint is from Khalilzadeh, Zahra, Motahareh Kashanian, Saeed Khaki, and Lizhi Wang. "A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection." arXiv preprint arXiv:2309.13021 (2023). doi: https://doi.org/10.48550/arXiv.2309.13021. Licensed under CC BY 4.0. Copyright 2023, The Authors.
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/JvNVD94v
dc.language.iso en
dc.publisher arXiv
dc.source.uri https://doi.org/10.48550/arXiv.2309.13021
dc.subject.disciplines DegreeDisciplines::Engineering::Bioresource and Agricultural Engineering
dc.subject.keywords Convolutional Neural Network
dc.subject.keywords Genotype Selection
dc.subject.keywords Crop Yield Prediction
dc.subject.keywords Generalized Ensemble Method
dc.title A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection
dc.type Preprint
dspace.entity.type Publication
relation.isAuthorOfPublication 8fecdf1e-7b86-41d4-acd4-ec8611237be3
relation.isOrgUnitOfPublication 51d8b1a0-5b93-4ee8-990a-a0e04d3501b1
File
Name: 2023-Wang-AHybridDeep.pdf
Size: 2.02 MB
Format: Adobe Portable Document Format