Exploratory analysis of high-dimensional data with visual tools

dc.contributor.advisor Hofmann, Heike
dc.contributor.advisor Berg, Emily
dc.contributor.advisor Carriquiry, Alicia
dc.contributor.advisor Ommen, Danica
dc.contributor.advisor Olafsson, Sigurdur
dc.contributor.author Jeppson, Haley
dc.contributor.department Statistics (LAS)
dc.date.accessioned 2022-11-09T02:34:14Z
dc.date.available 2022-11-09T02:34:14Z
dc.date.issued 2021-12
dc.date.updated 2022-11-09T02:34:14Z
dc.description.abstract This body of work combines statistical models and data visualizations in new and exciting ways. Exploratory data analysis and dimension reduction techniques serve as the backbone for developing visual tools designed to foster new ideas and an understanding of the underlying phenomenon of interest. Chapter 1 presents a thorough review of the literature regarding exploratory data analysis and graphical methods. Chapter 2 focuses on graphical methods for categorical variables and introduces an implementation of mosaic plots in R designed for generalized mosaic plots using the popular grammar of graphics R plotting paradigm, ggplot2 (Wickham 2016). We develop novel uses of mosaic plots that exemplify the capacity multidimensional categorical data visualization methods have for growth. We conclude with a Shiny application that facilitates a better understanding of the myriad of possible forms a mosaic plot can take by accommodating a thorough search through the variables and structural changes to the mosaic plot with the simple press of certain keystrokes. Chapter 3 further explores multidimensional categorical data visualizations and develops an approach to using mosaic plots to assess the lack of fit of a given ordinal model. We identify visual indicators for parameters in different models and extend the connection between mosaic plots of binary tables and odds ratios to include logistic regression models with ordinal variables. We then extend the concept to an ordinal response variable, requiring the introduction of cumulative odds ratios and the proportional odds ratio model, with which we address the assessment of higher dimensional interaction terms. The second part of Chapter 3 develops techniques for visual model diagnostics and selection with mosaic plots, resulting in a graphical forward step-wise selection procedure. We connect the model space and the data space by representing model residuals with jittered points on the mosaic plot, extending methods suggested by Theus and Lauer (1999) and Friendly (2002). We conclude by amending the procedure to include a second phase consisting of backward steps to tighten the model constraints by replacing some parameters with structured terms for ordinal classifications in association models. Chapter 4 switches focus to the exploratory data analyses of high-dimensional numeric data using tours. We present a new type of tour that incorporates novelty detection, providing a versatile approach to unsupervised dimension reduction. The method fuses multiple projection pursuit indices while maintaining a short memory, circumventing the issues encountered with the randomness of the grand tour and the narrow view of the projection pursuit optimization. We evaluate the behavior of the method with simulated data, and the results highlight the flexibility of the method. This work expands visual methods for exploring diverse types of high-dimensional data.
dc.format.mimetype PDF
dc.identifier.doi https://doi.org/10.31274/td-20240329-324
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/Nveolx8z
dc.language.iso en
dc.language.rfc3066 en
dc.subject.disciplines Statistics en_US
dc.subject.keywords Exploratory data analysis en_US
dc.subject.keywords Interactive graphics en_US
dc.subject.keywords Ordinal models en_US
dc.subject.keywords Statistical graphics en_US
dc.title Exploratory analysis of high-dimensional data with visual tools
dc.type dissertation en_US
dc.type.genre dissertation en_US
dspace.entity.type Publication
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
thesis.degree.discipline Statistics en_US
thesis.degree.grantor Iowa State University en_US
thesis.degree.level dissertation
thesis.degree.name Doctor of Philosophy en_US
File
Original bundle
Name: Jeppson_iastate_0097E_19845.pdf
Size: 4.88 MB
Format: Adobe Portable Document Format