Variance estimation for nearest neighbor imputation for US Census long form data
Kim, Jae Kwang
Variance estimation for estimators of state, county, and school district quantities derived from the Census 2000 long form is discussed. The variance estimator must account for (1) uncertainty due to imputation and (2) raking to census population controls. An imputation procedure that imputes more than one value for each missing item, using donors that are neighbors, is described, and the procedure using two nearest neighbors is applied to the Census long form. The Kim and Fuller [Biometrika 91 (2004) 559–578] method for variance estimation under fractional hot deck imputation is adapted for application to the long form data. Numerical results from the 2000 long form data are presented.
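As a rough illustration of imputing more than one value per missing item from neighboring donors, the sketch below assigns each nonrespondent the values of its two nearest respondents, each carrying half of the recipient's weight. It is a minimal sketch under stated assumptions, not the procedure used for the long form: the function name `fractional_nn_impute`, the single auxiliary variable `x` used to define distance, the equal fractional weights, and the toy data are all hypothetical. It does not cover raking or variance estimation.

```python
from typing import List, Optional, Tuple


def fractional_nn_impute(
    x: List[float],
    y: List[Optional[float]],
    w: List[float],
    n_donors: int = 2,
) -> List[Tuple[float, float]]:
    """Return (value, weight) rows of a fractionally imputed data set.

    Assumptions (not from the source): distance is |x_i - x_j| on one
    auxiliary variable, and each donor gets an equal share of the
    recipient's weight. Missing y values are represented by None.
    """
    donors = [j for j, yj in enumerate(y) if yj is not None]
    if len(donors) < n_donors:
        raise ValueError("not enough respondents to serve as donors")

    rows: List[Tuple[float, float]] = []
    for i, yi in enumerate(y):
        if yi is not None:
            # Respondents keep their own value and full weight.
            rows.append((yi, w[i]))
            continue
        # sorted() is stable, so distance ties go to the earlier donor.
        nearest = sorted(donors, key=lambda j: abs(x[j] - x[i]))[:n_donors]
        for j in nearest:
            # Each donor value carries an equal fraction of the weight.
            rows.append((y[j], w[i] / n_donors))
    return rows


# Toy data: units 1 and 4 are missing y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, None, 30.0, 40.0, None]
w = [1.0, 1.0, 1.0, 1.0, 1.0]

rows = fractional_nn_impute(x, y, w)
total_weight = sum(wt for _, wt in rows)
est_mean = sum(v * wt for v, wt in rows) / total_weight
```

Because the fractional weights for each recipient sum to its original weight, the total weight of the imputed data set equals the total weight of the sample.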
This article is published as Kim, Jae Kwang; Fuller, Wayne A.; Bell, William R. Variance estimation for nearest neighbor imputation for US Census long form data. Annals of Applied Statistics 5 (2011), no. 2A, 824–842. doi:10.1214/10-AOAS419. Posted with permission.