Exploring the Information in p‐Values for the Analysis and Planning of Multiple‐Test Experiments

dc.contributor.author Ruppert, David
dc.contributor.author Nettleton, Dan
dc.contributor.author Hwang, J.T. Gene
dc.contributor.department Statistics
dc.date 2019-09-12T01:21:19.000
dc.date.accessioned 2020-07-02T06:57:23Z
dc.date.available 2020-07-02T06:57:23Z
dc.date.copyright Sun Jan 01 00:00:00 UTC 2006
dc.date.issued 2007-06-01
dc.description.abstract <p>A new methodology is proposed for estimating the proportion of true null hypotheses in a large collection of tests. Each test concerns a single parameter δ whose value is specified by the null hypothesis. We combine a parametric model for the conditional cumulative distribution function (CDF) of the <em>p</em>‐value given δ with a nonparametric spline model for the density <em>g</em>(δ) of δ under the alternative hypothesis. The proportion of true null hypotheses and the coefficients in the spline model are estimated by penalized least squares subject to constraints that guarantee that the spline is a density. The estimator is computed efficiently using quadratic programming. Our methodology produces an estimate of the density of δ when the null is false and can address such questions as “when the null is false, is the parameter usually close to the null or far away?” This leads us to define a falsely interesting discovery rate (FIDR), a generalization of the false discovery rate. We contrast the FIDR approach with <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/j.1541-0420.2006.00704.x#b6" id="x-x-#b6R">Efron's</a> (2004, <em>Journal of the American Statistical Association</em> 99, 96–104) empirical null hypothesis technique. We discuss the use of the estimate of <em>g</em> in sample size calculations based on the expected discovery rate (EDR). Our recommended estimator of the proportion of true nulls has less bias than estimators based on the marginal density of the <em>p</em>‐values at 1. In a simulation study, we compare our estimators to the convex, decreasing estimator of <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/j.1541-0420.2006.00704.x#b12" id="x-x-#b12R">Langaas, Lindqvist, and Ferkingstad</a> (2005, <em>Journal of the Royal Statistical Society, Series B</em> 67, 555–572). The most biased of our estimators is very similar in performance to the convex, decreasing estimator. As an illustration, we analyze differences in gene expression between resistant and susceptible strains of barley.</p>
dc.description.comments <p>This is a manuscript of an article published as Ruppert, David, Dan Nettleton, and JT Gene Hwang. "Exploring the information in p‐values for the analysis and planning of multiple‐test experiments." <em>Biometrics</em> 63, no. 2 (2007): 483-495. doi: <a href="https://doi.org/10.1111/j.1541-0420.2006.00704.x">10.1111/j.1541-0420.2006.00704.x</a>. Posted with permission.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/stat_las_pubs/239/
dc.identifier.articleid 1249
dc.identifier.contextkey 14913213
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath stat_las_pubs/239
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/90553
dc.language.iso en
dc.source.uri https://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=1092&context=stat_las_preprints
dc.subject.disciplines Design of Experiments and Sample Surveys
dc.subject.disciplines Microarrays
dc.subject.disciplines Statistical Methodology
dc.subject.disciplines Statistical Models
dc.title Exploring the Information in p‐Values for the Analysis and Planning of Multiple‐Test Experiments
dc.type article
dc.type.genre article
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca