Statistical methods for conditional dependence and microbiome data analysis
Date
2022-08
Authors
Zhao, Wenting
Major Professor
Advisor
Liu, Peng
Qiu, Yumou
Wang, Chong
Hofmann, Heike
Chu, Lynna
Committee Member
Abstract
This dissertation consists of three research projects on microbiome data analysis and conditional dependence. Chapter 2 addresses the analysis of microbiome data to rank differentially abundant features. Chapters 3 and 4 study the conditional dependence between bivariate outcomes in the presence of covariates.
Microbiome data have been widely collected to study plant, animal, and human health. Identifying microbial features that are differentially abundant (DA) is a fundamental step of microbiome data analysis. In some studies, hundreds to thousands of microbial features are declared significant, and these features need to be prioritized for downstream analysis. Ranking DA features offers more insight than the binary outcome of a DA test. In Chapter 2, we propose hierarchical Bayesian approaches based on zero-inflated Poisson (ZIP) models to rank DA features for microbiome data analysis. The ZIP model accounts for the excess zeros in microbiome data, and the Bayesian approach allows borrowing information across features. We propose two methods, ZIP-PM and ZIP-RPM, that prioritize the log fold change and a ranking statistic, respectively. Our methods perform favorably compared to existing methods for ranking DA features in both simulation studies and real data analysis. In addition, our methods perform well in simulations even when the data are not simulated according to our model.
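As a point of reference for the ZIP model mentioned above, the following is a minimal sketch of the standard zero-inflated Poisson probability mass function. The function name `zip_pmf` and the parameter names `pi` (structural-zero probability) and `lam` (Poisson mean) are illustrative assumptions; the hierarchical Bayesian structure and the ZIP-PM and ZIP-RPM ranking rules of the dissertation are not reproduced here.

```python
import math


def zip_pmf(k, pi, lam):
    """P(Y = k) under a zero-inflated Poisson distribution.

    With probability `pi` the count is a structural zero; otherwise it is
    drawn from Poisson(`lam`). Names are illustrative, not the author's.
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    # The indicator (k == 0) adds the extra point mass at zero.
    return pi * (k == 0) + (1 - pi) * poisson
```

For example, `zip_pmf(0, 0.3, 2.0)` combines the structural-zero mass 0.3 with the Poisson probability of zero scaled by 0.7, which is how the model can accommodate more zeros than a plain Poisson distribution.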
Assessing the association between bivariate outcomes is often of scientific interest. In the presence of covariates, conditional associations that adjust for the effects of the covariates provide insight into the direct relationship between the bivariate outcomes and a more intrinsic measure of association. In Chapter 3, we propose a flexible and robust association test to study the conditional dependence between bivariate outcomes. We adjust the traditional Kendall's tau coefficient by inverse probability weighting with pairwise propensity scores, and propose a statistic that measures conditional association in the form of a U-statistic and can be viewed as an average conditional Kendall's tau. The pairwise propensity scores in our statistic can be estimated flexibly by either parametric or non-parametric methods. Our new measure of conditional association has zero expectation under conditional independence, as long as either one of the propensity score models is correctly specified. In addition, we show that our proposed statistic is asymptotically normal under the null hypothesis, which provides a valid test if the models for both propensity scores are correct and the estimates of the propensity scores are consistent. Simulation studies show that our test controls the Type I error rate and has competitive power. Furthermore, our test is more robust to model misspecification than existing methods.
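To illustrate what "in the form of a U-statistic" means, the sketch below computes the classical, unweighted Kendall's tau as an average of a sign kernel over all pairs of observations. The function names are illustrative assumptions. The inverse probability weighting with pairwise propensity scores is deliberately not shown, because the abstract does not give its formula.

```python
from itertools import combinations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau_u(x, y):
    """Classical Kendall's tau written as a U-statistic of order two.

    Averages the kernel sign(x_i - x_j) * sign(y_i - y_j) over all
    unordered pairs (i, j). This is the unweighted coefficient only.
    """
    n = len(x)
    total = sum(
        sign(x[i] - x[j]) * sign(y[i] - y[j])
        for i, j in combinations(range(n), 2)
    )
    return total / (n * (n - 1) / 2)
```

Concordant pairs contribute +1 and discordant pairs contribute -1, so perfectly ordered data give a value of 1 and perfectly reversed data give -1.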
In Chapter 4, we extend the idea of Chapter 3 to investigate conditional dependence between bivariate outcomes. We propose a quadruply robust association test that combines the PIP-tau method of Chapter 3 with the Kendall's tau correlation coefficient on the residuals, so as to use information from both the propensity score models and the outcome models. The new measure is quadruply robust because it has zero expectation under conditional independence, as long as either one of the propensity score models, or either one of the outcome models, is correctly specified. Our proposed test statistic is shown to be asymptotically normal under the null hypothesis, and the test is valid when the models for both propensity scores, or the models for both outcomes, are correctly specified. We evaluate the performance of our proposed test in a series of simulation studies and show that it controls the Type I error rate and is both powerful and robust.
Type
dissertation