A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection

Khalilzadeh, Zahra; Kashanian, Motahareh; Khaki, Saeed; Wang, Lizhi

dc.contributor.author	Khalilzadeh, Zahra
dc.contributor.author	Kashanian, Motahareh
dc.contributor.author	Khaki, Saeed
dc.contributor.author	Wang, Lizhi
dc.contributor.department	Department of Industrial and Manufacturing Systems Engineering
dc.date.accessioned	2023-10-25T20:09:21Z
dc.date.available	2023-10-25T20:09:21Z
dc.date.issued	2023-09-25
dc.description.abstract	Accurately predicting crop yield is vital for improving agricultural breeding and keeping crop production resilient under diverse climatic conditions. Integrating weather data from across the growing season, especially for different genotypes, is crucial for these predictions and is an important step toward understanding how climate change affects a variety’s adaptability. For the MLCAS2021 Crop Yield Prediction Challenge, the Third International Workshop on Machine Learning for Cyber-Agricultural Systems released a dataset for soybean hybrids with 93,028 training performance records, to be used to predict yield for 10,337 testing performance records. The dataset spans 159 locations across 28 U.S. states and Canadian provinces over a 13-year period, from 2003 to 2015. It covers 5,838 distinct genotypes and includes daily weather data for a 214-day growing season for every location and year combination. As one of the winning teams, we designed two novel convolutional neural network (CNN) architectures. The first combines a CNN with fully connected (FC) neural networks (the CNN-DNN model). The second adds an LSTM layer at the end of the CNN part for the weather variables (the CNN-LSTM-DNN model). The Generalized Ensemble Method (GEM) was then used to determine the optimal weights for combining the proposed CNN-based models, with the aim of achieving higher accuracy than the baseline models. The resulting GEM model outperformed all baseline models used for soybean yield prediction: on test data, its RMSE was 5.55% to 39.88% lower, its MAE was 5.34% to 43.76% lower, and its correlation coefficient was 1.1% to 10.79% higher than those of the baselines.
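As an illustration of the ensemble-weighting step mentioned in the abstract, the following is a minimal sketch of one common closed-form version of GEM, in which the weights minimize the ensemble's mean squared error subject to summing to one. This record does not give the preprint's exact formulation, so the function name `gem_weights`, the synthetic data, and the absence of a non-negativity constraint are all assumptions made for the example.

```python
import numpy as np

def gem_weights(preds: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Closed-form weights minimizing ensemble MSE subject to sum(w) == 1.

    preds has shape (n_samples, n_models); y has shape (n_samples,).
    Hypothetical helper for illustration, not the authors' code.
    """
    errors = preds - y[:, None]            # per-model residuals
    C = errors.T @ errors / len(y)         # second-moment matrix of the residuals
    ones = np.ones(C.shape[0])
    c_inv_ones = np.linalg.solve(C, ones)  # C^{-1} 1 without forming the inverse
    return c_inv_ones / (ones @ c_inv_ones)

def rmse(pred: np.ndarray, y: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# Synthetic stand-ins for the validation predictions of two models.
rng = np.random.default_rng(0)
y = rng.normal(50.0, 10.0, size=500)
preds = np.column_stack([
    y + rng.normal(0.0, 4.0, size=500),    # stand-in for a more accurate model
    y + rng.normal(0.0, 6.0, size=500),    # stand-in for a less accurate model
])

w = gem_weights(preds, y)
ensemble = preds @ w
print("weights:", w)
print("RMSE per model:", [rmse(preds[:, j], y) for j in range(preds.shape[1])])
print("RMSE ensemble:", rmse(ensemble, y))
```

On the data used to fit the weights, the weighted ensemble cannot have a higher RMSE than any single model, because putting all weight on one model is itself a feasible choice. Whether that advantage carries over to held-out data depends on how well the residual statistics generalize.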
The proposed CNN-DNN model was then used to identify the best-performing genotypes for various locations and weather conditions by predicting yield for every candidate genotype in each specific setting. The dataset provides unique genotype information on seeds, which makes it possible to investigate which genotypes to plant based on weather variables. The proposed data-driven approach can be valuable for genotype selection when only a few years of testing are available. We also performed a feature importance analysis based on the change in Root Mean Square Error (RMSE) to identify the predictors that most affect our model’s predictions. Location showed the highest RMSE change, emphasizing its pivotal role in predictions, followed by maturity group (MG), year, and genotype, showcasing their significance during crop growth stages and across different years. Among the weather variables, maximum direct normal irradiance (MDNI) and average precipitation (AP) showed higher RMSE changes, indicating their importance.
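The RMSE-change analysis described in the abstract can be sketched as a permutation test: shuffle one input column at a time and record how much the model's RMSE rises. The abstract does not say how the change was computed, so permutation is an assumption here, and the toy linear `predict` function and the helper `rmse_change` are hypothetical stand-ins for the trained CNN-DNN model.

```python
import numpy as np

def rmse(pred: np.ndarray, y: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - y) ** 2)))

def rmse_change(predict, X: np.ndarray, y: np.ndarray, rng) -> np.ndarray:
    """Increase in RMSE when each feature column is shuffled in turn."""
    base = rmse(predict(X), y)
    changes = []
    for j in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature-target link
        changes.append(rmse(predict(X_perm), y) - base)
    return np.array(changes)

# Toy setup: feature 0 matters most, feature 1 a little, feature 2 not at all.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1]

def predict(X: np.ndarray) -> np.ndarray:
    # Stand-in for a fitted model; a real analysis would call the trained network.
    return 5.0 * X[:, 0] + 1.0 * X[:, 1]

changes = rmse_change(predict, X, y, rng)
ranking = np.argsort(changes)[::-1]  # most important feature first
print("RMSE change per feature:", changes)
print("ranking:", ranking)
```

A larger RMSE change means the model relies more heavily on that feature. A feature the model ignores yields a change of zero.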
dc.description.comments	This preprint is from Khalilzadeh, Zahra, Motahareh Kashanian, Saeed Khaki, and Lizhi Wang. "A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection." arXiv preprint arXiv:2309.13021 (2023). doi: https://doi.org/10.48550/arXiv.2309.13021. CC BY 4.0. Copyright 2023, The Authors.
dc.identifier.uri	https://dr.lib.iastate.edu/handle/20.500.12876/JvNVD94v
dc.language.iso	en
dc.publisher	arXiv
dc.source.uri	https://doi.org/10.48550/arXiv.2309.13021
dc.subject.disciplines	DegreeDisciplines::Engineering::Bioresource and Agricultural Engineering
dc.subject.keywords	Convolutional Neural Network
dc.subject.keywords	Genotype Selection
dc.subject.keywords	Crop Yield Prediction
dc.subject.keywords	Generalized Ensemble Method
dc.title	A Hybrid Deep Learning-based Approach for Optimal Genotype by Environment Selection
dc.type	Preprint
dspace.entity.type	Publication
relation.isAuthorOfPublication	8fecdf1e-7b86-41d4-acd4-ec8611237be3
relation.isOrgUnitOfPublication	51d8b1a0-5b93-4ee8-990a-a0e04d3501b1

File

Original bundle

Name: 2023-Wang-AHybridDeep.pdf
Size: 2.02 MB
Format: Adobe Portable Document Format

Collections

Publications