Weighting in survey analysis under informative sampling
Kim, Jae Kwang; Skinner, Chris J.
Sampling related to the outcome variable of a regression analysis conditional on covariates is called informative sampling and may lead to bias in ordinary least squares estimation. Weighting by the reciprocal of the inclusion probability approximately removes such bias but may inflate variance. This paper investigates two ways of modifying such weights to improve efficiency while retaining consistency. One approach is to multiply the inverse probability weights by functions of the covariates. The second is to smooth the weights given values of the outcome variable and covariates. Optimal ways of constructing weights by these two approaches are explored. Both approaches require the fitting of auxiliary weight models. The asymptotic properties of the resulting estimators are investigated and linearization variance estimators are obtained. The approach is extended to pseudo maximum likelihood estimation for generalized linear models. The properties of the different weighted estimators are compared in a limited simulation study. The robustness of the estimators to misspecification of the auxiliary weight model or of the regression model of interest is discussed.
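The bias described in the abstract, and its approximate removal by inverse probability weighting, can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: a synthetic linear population model, a logistic inclusion probability that depends on the regression error, and Poisson sampling. The helper name `weighted_ls` is hypothetical. The sketch covers only the basic inverse probability weights, not the modified or smoothed weights that the paper investigates.

```python
import numpy as np


def weighted_ls(X, y, w):
    """Solve the weighted least squares normal equations (X'WX) b = X'Wy."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)


rng = np.random.default_rng(0)

# Assumed finite population: y = 1 + 2x + e with standard normal x and e.
N = 200_000
x = rng.normal(size=N)
e = rng.normal(size=N)
y = 1.0 + 2.0 * x + e

# Assumed informative design: the inclusion probability increases with the
# regression error e, so selection depends on y even after conditioning on x.
pi = 1.0 / (1.0 + np.exp(-(-3.0 + e)))

# Poisson sampling: each unit is selected independently with probability pi.
sampled = rng.random(N) < pi

X = np.column_stack([np.ones(sampled.sum()), x[sampled]])
ys = y[sampled]

# Unweighted ordinary least squares on the sample.
beta_ols = weighted_ls(X, ys, np.ones(len(ys)))

# Least squares weighted by the reciprocal of the inclusion probability.
beta_ipw = weighted_ls(X, ys, 1.0 / pi[sampled])

print("OLS (intercept, slope):", beta_ols)
print("IPW (intercept, slope):", beta_ipw)
```

In this setup the unweighted intercept is pulled well above the true value of 1 because units with large positive errors are over-represented, while the weighted intercept stays close to 1. The slope is largely unaffected here because the assumed inclusion probability does not depend on x.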
This is a manuscript of an article published as Kim, Jae Kwang, and Chris J. Skinner. "Weighting in survey analysis under informative sampling." Biometrika 100, no. 2 (2013): 385-398. doi:10.1093/biomet/ass085. Posted with permission.