Response errors in repeated surveys with duplicated observations

Chua, Tin

The analysis of response error models for categorical data that form an r × r contingency table is considered. Individuals are placed in the row and column classes on the basis of two interviews. It is assumed that the errors in the row and in the column classifications are independent. It is also assumed that the error in the classification of an individual depends only on the individual's true class. A parametric model for the probability that an individual belonging to the i-th class is classified in the j-th class is proposed.

Reinterview on one of the dimensions is conducted in order to estimate the classification probabilities. Two kinds of reinterview procedures are performed by the U.S. Bureau of the Census in the Current Population Survey. In the first kind, the reinterviewers are not given the original responses. In the second kind, the original responses are given to the reinterviewers and a reconciliation is made after the responses are collected in the reinterview. The Gauss-Newton procedure for the nonlinear model is used to estimate the parameters of the classification model from data collected in the three interviews.

The determination of the optimal number of replicates to observe for the estimation of the simple errors-in-variables model is considered. It is assumed that the cost of obtaining an observation is the same for every unit. For a fixed total cost, the optimal ratio of the number of units with duplicated observations to the total number of units is obtained by minimizing the variance of the estimator of the slope in the simple linear errors-in-variables regression model. Extension of replicated designs to three observations per unit is considered under the condition that all the units in the sample are observed twice. Tables of optimal designs are constructed for some specific values of the parameters of the model. The optimal design for the case where the observed values are dichotomous is also considered.
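
As a minimal sketch of what the two assumptions in the response error model imply (row and column errors independent, and each error depending only on the true class), the probability of observing cell (i, j) is the sum over true cells (a, b) of the true-cell probability times the two classification probabilities. The function name, the variable names, and all numbers below are hypothetical illustrations. They are not the parametric model or the data of the thesis, which the abstract does not specify.

```python
def observed_table(true_table, k_row, k_col):
    """Return the r x r table of P(observed row i, observed column j).

    true_table[a][b]: probability that the true classes are (a, b).
    k_row[a][i]: probability that true row class a is recorded as class i.
    k_col[b][j]: probability that true column class b is recorded as class j.
    """
    r = len(true_table)
    return [
        [
            # Independent errors: multiply the row and column
            # classification probabilities, then sum over true classes.
            sum(
                true_table[a][b] * k_row[a][i] * k_col[b][j]
                for a in range(r)
                for b in range(r)
            )
            for j in range(r)
        ]
        for i in range(r)
    ]


# Illustrative numbers only (assumed, not taken from the thesis).
true_table = [[0.60, 0.05],
              [0.05, 0.30]]
# Assumption: the same classification probabilities apply in both interviews.
k = [[0.95, 0.05],
     [0.10, 0.90]]

observed = observed_table(true_table, k, k)
```

With error-free classification (an identity matrix for both `k_row` and `k_col`) the observed table equals the true table, so any difference between the two comes entirely from the classification probabilities. This is why the reinterview data are needed to estimate them.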