New statistical methods in bioinformatics: for the analysis of quantitative trait loci (QTL), microarrays, and eQTLs
This thesis focuses on new statistical methods in bioinformatics, a field that uses computers and statistics to solve biological problems. The first study discusses a method for detecting a quantitative trait locus (QTL) when the trait of interest has a zero-inflated Poisson (ZIP) distribution. Though existing methods based on normality may be reasonably applied to some ZIP distributions, the characteristics of other ZIP distributions make such an application inappropriate. We compare our method to an existing non-parametric approach, and we illustrate our method using QTL data collected on two ecotypes of the Arabidopsis thaliana plant where the trait of interest is shoot count.

The second study discusses a method to detect differentially expressed genes in an unreplicated multiple-treatment microarray timecourse experiment. In a two-sample setting, differential expression is well defined as unequal means, but in the present setting, there are numerous expression patterns that may qualify as differential expression and that may be of interest to the researcher. This method provides the researcher with a list of significant genes, an associated false discovery rate for that list, and a 'best model' choice for every gene. The model choice component is relevant because the alternative hypothesis of differential expression does not dictate one specific alternative expression pattern; in this type of experiment, many possible expression patterns are of interest to the researcher. Using simulations and receiver operating characteristic curves, we provide information on the specificity and sensitivity of detection under a variety of true expression patterns. The method is illustrated using an Arabidopsis thaliana microarray experiment with five time points and three treatment groups.

The third study discusses a new type of analysis, called eQTL analysis.
This analysis brings together the methods of microarray and QTL analyses in order to detect locations on the genome that control gene expression. These controlling loci are called expression QTL, or eQTL. Locating eQTL can help researchers uncover complex networks in biological systems. The method is illustrated using an Arabidopsis thaliana eQTL experiment with 22,787 genes and 288 markers.