Model estimation, identification and inference for next-generation functional data and spatial data

dc.contributor.advisor Dan Nettleton
dc.contributor.author Yu, Shan
dc.contributor.department Statistics
dc.date 2020-09-23T19:12:41.000
dc.date.accessioned 2021-02-25T21:37:13Z
dc.date.available 2021-02-25T21:37:13Z
dc.date.copyright Sat Aug 01 00:00:00 UTC 2020
dc.date.embargo 2020-09-10
dc.date.issued 2020-01-01
dc.description.abstract <p>This dissertation is composed of three research projects focused on model estimation, identification, and inference for next-generation functional data and spatial data.</p> <p>The first project deals with data consisting of a count or binary response together with spatial covariate information. In this project, we introduce a new class of generalized geoadditive models (GGAMs) for spatial data distributed over complex domains. Through a link function, the proposed GGAM assumes that the mean of the discrete response variable depends on additive univariate functions of explanatory variables and a bivariate function to adjust for the spatial effect. We propose a two-stage approach for estimating and making inferences about the components in the GGAM. In the first stage, the univariate components and the geographical component in the model are approximated via univariate polynomial splines and bivariate penalized splines over triangulation, respectively. In the second stage, local polynomial smoothing is applied to the cleaned univariate data to average out the variation of the first-stage estimators. We investigate the consistency of the proposed estimators and the asymptotic normality of the univariate components. We also establish a simultaneous confidence band for each of the univariate components. The performance of the proposed method is evaluated through two simulation studies and an application to crash count data from the Tampa-St. Petersburg urbanized area in Florida.</p> <p>In the second project, motivated by recent work on analyzing data from biomedical imaging studies, we consider a class of image-on-scalar regression models for imaging responses and scalar predictors. We propose to use flexible multivariate splines over triangulations to handle the irregular domain of the objects of interest on the images and other characteristics of images.
The proposed estimators of the coefficient functions are proved to be root-$n$ consistent and asymptotically normal under some regularity conditions. We also provide a consistent and computationally efficient estimator of the covariance function. Asymptotic pointwise confidence intervals (PCIs) and data-driven simultaneous confidence corridors (SCCs) for the coefficient functions are constructed. A highly efficient and scalable estimation algorithm is developed. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed method. The proposed method is applied to spatially normalized positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).</p> <p>In the third project, we propose a heterogeneous functional linear model to simultaneously estimate multiple coefficient functions and identify groups, such that coefficient functions are identical within groups and distinct across groups. By borrowing information from relevant subgroups, our method enhances estimation efficiency while preserving heterogeneity. We use an adaptive fused lasso penalty to shrink subgroup coefficients to shared common values within each group. We also establish the theoretical properties of our adaptive fused lasso estimators. To enhance computational efficiency and incorporate neighborhood information, we propose to use a graph-constrained adaptive lasso. A highly efficient and scalable estimation algorithm is developed. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed method. The proposed method is applied to a dataset of hybrid maize grain yields from the Genomes to Fields consortium.</p>
dc.format.mimetype application/pdf
dc.identifier archive/lib.dr.iastate.edu/etd/18251/
dc.identifier.articleid 9258
dc.identifier.contextkey 19236853
dc.identifier.doi https://doi.org/10.31274/etd-20200902-170
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath etd/18251
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/94403
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/etd/18251/Yu_iastate_0097E_18958.pdf|||Fri Jan 14 21:39:12 UTC 2022
dc.subject.keywords Brain image analysis
dc.subject.keywords Functional data analysis
dc.subject.keywords Genotype-by-environment interaction study
dc.subject.keywords Multivariate spline approximation
dc.subject.keywords Spatial data analysis
dc.subject.keywords Subgroup
dc.title Model estimation, identification and inference for next-generation functional data and spatial data
dc.type article
dc.type.genre dissertation
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.discipline Statistics
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy
File
Original bundle
Name:
Yu_iastate_0097E_18958.pdf
Size:
13.47 MB
Format:
Adobe Portable Document Format