Rank-based inference for conditional independence and its applications

Hu, Haoyan
Major Professor
Qiu, Yumou
Kaiser, Mark
Liu, Peng
Wang, Lily
Zhu, Zhengyuan
Committee Member
This dissertation consists of three research projects focused on rank-based inference for conditional independence and its applications to missing data and Markov random fields. In the first project, we propose a statistical inference procedure for partial correlations under the high-dimensional nonparanormal (NPN) model, which assumes that the observed data follow a normal distribution after marginal monotone transformations. The nonparanormal partial correlation is the partial correlation of the normal-transformed data under the NPN model, and it provides a more general measure of conditional dependence. It is estimated through regularized rank-based nodewise regression (RRNR), and a multiple testing procedure is proposed to recover the nonzero NPN partial correlations. We establish the asymptotic normality of the proposed estimator and provide theoretical justification for the proposed multiple testing procedure. Numerical simulations and two real data applications are presented to demonstrate the performance of the RRNR procedure. The second project extends the RRNR procedure to handle data that are missing completely at random (MCAR), which leads to the proposed mRRNR estimators. These estimators are motivated by the increasing occurrence of missingness in high-dimensional data sets. The mRRNR estimators relax the Gaussian assumption and accommodate missing data. A multiple testing procedure adapted to missing values is proposed to recover the conditional independence graph. We compare the performance of the proposed estimators in numerical simulations and demonstrate their potential in a real data application to a news data set. In the third project, we explore the capability of the RRNR procedure to recover the conditional independence structure in Markov random fields (MRFs). We assume that the MRF is formulated through node-conditional distributions from a one-parameter exponential family. The semiparametric transformations in RRNR make it robust to non-Gaussian MRFs. In addition, we modify the RRNR method so that it can distinguish eight-nearest-neighbor from four-nearest-neighbor structures, in a way that is also insensitive to the original data distributions. Extensive simulations are carried out to demonstrate the application of RRNR to Markov random fields.
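The rank-based idea behind the NPN model can be illustrated with a small sketch. It is not the RRNR procedure from the dissertation, which uses regularized nodewise regression for high-dimensional data. It is a minimal three-variable example under stated assumptions: the latent correlation is estimated from Spearman's rho through the standard Gaussian-copula relation r = 2 sin(πρ/6), and the partial correlation comes from the textbook three-variable formula. All function names and the simulated data are hypothetical and chosen for illustration only.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks (assumes no ties, as with continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def latent_corr(x, y):
    """Rank-based estimate of the latent Gaussian correlation.

    Spearman's rho is Pearson's correlation of the ranks; the sine
    transform maps it to the correlation of the latent normal scores.
    """
    rho_s = pearson(ranks(x), ranks(y))
    return 2.0 * math.sin(math.pi * rho_s / 6.0)


def partial_corr_3(r12, r13, r23):
    """Partial correlation of variables 1 and 2 given variable 3."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# Simulated latent Gaussian data: Z1 and Z2 depend on each other only via Z3.
random.seed(0)
n = 2000
z3 = [random.gauss(0, 1) for _ in range(n)]
z1 = [z + random.gauss(0, 1) for z in z3]
z2 = [z + random.gauss(0, 1) for z in z3]

# Observed data: strictly increasing marginal transformations (NPN setting).
x1 = [math.exp(z) for z in z1]
x2 = [z ** 3 for z in z2]
x3 = z3

r12 = latent_corr(x1, x2)
r13 = latent_corr(x1, x3)
r23 = latent_corr(x2, x3)
pc12 = partial_corr_3(r12, r13, r23)

print(f"marginal latent correlation (1,2): {r12:.3f}")
print(f"partial correlation (1,2 | 3):     {pc12:.3f}")
```

In this simulation the first two variables are marginally correlated but conditionally independent given the third, so the estimated partial correlation should be close to zero even though the observed margins are far from Gaussian. Because only ranks enter the estimate, applying the same functions to the untransformed latent data gives essentially the same values.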