Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model
Kim, Jae Kwang
Combining information from different sources is an important practical problem in survey sampling. Using a hierarchical area-level model, we establish a framework to integrate auxiliary information to improve state-level area estimates. The best predictors are obtained as the conditional expectations of latent variables given the observations, and an estimator of the mean squared prediction error is discussed. In work sponsored by the National Agricultural Statistics Service of the US Department of Agriculture, the proposed model is applied to the planted crop acreage estimation problem by combining information from three sources: the June Area Survey, which is based on probability sampling of land; administrative data on planted acreage; and the cropland data layer, a commodity-specific classification product derived from remote sensing data. The proposed model combines the available information at a sub-state level called the agricultural statistics district and aggregates it to improve state-level estimates of planted acreage for different crops. Supplementary materials accompanying this paper appear online.
This article is published as Kim, Jae Kwang, Zhonglei Wang, Zhengyuan Zhu, and Nathan B. Cruze. "Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model." Journal of Agricultural, Biological and Environmental Statistics 23, no. 2 (2018): 175-189. doi: 10.1007/s13253-018-0320-2.