Genomic Selection with Deep Neural Networks
Abstract
Reduced costs for DNA marker technology have generated a huge amount of molecular
data and made it economically feasible to generate dense genome-wide marker maps of lines
in a breeding program. Increased data density and volume have driven an exploration of
tools and techniques to analyze these data for cultivar improvement. Data science theory
and application have experienced a resurgence of research into techniques to detect or "learn"
patterns in noisy data in a variety of technical applications. Several variants of machine
learning have been proposed for analyzing large DNA marker data sets to aid in phenotype
prediction and genomic selection. Here, we present a review of the genomic prediction
and machine learning literature. We apply deep learning techniques from machine learning
research to six phenotypic prediction tasks using published reference datasets. Because
regularization frequently improves neural network prediction accuracy, we included
regularization methods in the neural network models. The neural network models are compared to
a selection of regularized Bayesian and linear regression techniques commonly employed for
phenotypic prediction and genomic selection. On three of the phenotype prediction tasks,
regularized neural networks were the most accurate of the models evaluated. Surprisingly,
for these data sets, the depth of the network architecture did not affect the accuracy of the
trained model. We also find that concerns about the processing time needed to train
well-performing neural network models for genomic prediction tasks may not apply when
graphics processing units (GPUs) are used for model training.