Application of Optimization and Simulation Models in Genomic Prediction and Genomic Selection

Amini, Fatemeh
Major Professor
Hu, Guiping
Driven by population growth, climate change, and biofuel consumption, the demand for agricultural products has been estimated to double by 2050. Given this growth, the agricultural production system has to become ever more efficient and robust to ensure food security. To address this challenge, crop improvement and plant breeding processes have to be employed to enhance the quality and quantity of crop production. In this dissertation, we adopt simulation and data analytics methods to tackle a few challenges in the crop improvement and breeding process. First, we address the high dimensionality of genetic data and introduce a novel two-layer feature selection method that reduces the dimension of the feature space while improving genetic prediction accuracy in genomic selection algorithms. Furthermore, we design a realistic simulator of the breeding process whose goal is to imitate the uncertainty of nature and provide reliable genetic outcomes. Moreover, from the decision-making perspective, we propose a new selection strategy, called look ahead trace back selection, that aims at improving the performance of a single trait at the end of the breeding cycle. Additionally, a new optimization model is introduced to maximize the performance of multiple traits simultaneously. Multiple challenges in the breeding process make its improvement more difficult. Two of these challenges are addressed, namely improving prediction accuracy in genomic prediction and handling the uncertainties due to recombination events in the mating process. To address the first challenge, we tune the hyper-parameters of the adopted prediction methods inside cross-validation loops, and we tune the parameters of the proposed two-layer feature selection subject to the available computational capacity.
To address the second challenge, we develop a comprehensive simulation platform in which multiple simulation runs are conducted independently to ensure the robustness of the results of any proposed approach in comparison with conventional methods. This dissertation includes five chapters. Chapter 1 presents a general overview along with the problem statements and a summary of contributions. In Chapter 2, we develop a two-layer feature selection method, a hybrid wrapper-embedded method, to reduce the feature dimension in genomic prediction while maintaining or improving the prediction accuracy. In Chapter 3, we design the look ahead trace back selection algorithm, which improves the genetic gain in the breeding process. Moreover, a realistic opaque simulator that accounts for the uncertainties of nature is introduced in this chapter. In Chapter 4, an L-shaped selection algorithm is proposed to improve the genetic gain in multi-trait genomic selection. The aim of this algorithm is to maximize multiple traits at the same time by capturing all Pareto optimal individuals and maintaining population diversity. In Chapter 5, we analyze the performance of integrating the proposed two-layer feature selection in improving the genetic gain in different genomic selection algorithms. Moreover, a comprehensive comparison framework is formulated that can integrate different prediction methods, multiple genomic selection algorithms, and different simulation methods, such as transparent and opaque simulators.
Industrial engineering, Genomic Prediction, Genomic Selection, L-shaped Selection, Multi-trait Genomic Selection, Opaque Simulator