Sampling techniques for big data analysis in finite population inference

dc.contributor.author Kim, Jae Kwang
dc.contributor.author Wang, Zhonglei
dc.contributor.department Statistics
dc.date 2018-08-29T20:34:27.000
dc.date.accessioned 2020-07-02T06:55:59Z
dc.date.available 2020-07-02T06:55:59Z
dc.date.issued 2018-01-29
dc.description.abstract <p>In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first method uses a version of inverse sampling by incorporating auxiliary information from external sources, and the second one borrows the idea of data integration by combining the big data sample with an independent probability sample. Two simulation studies show that the proposed methods are unbiased and have better coverage rates than their alternatives. In addition, the proposed methods are easy to implement in practice.</p>
dc.description.comments <p>This is a manuscript that has been accepted for publication in <em>International Statistical Review: </em><a href="https://arxiv.org/abs/1801.09728" target="_blank">https://arxiv.org/abs/1801.09728</a>.</p>
dc.identifier archive/lib.dr.iastate.edu/stat_las_preprints/136/
dc.identifier.articleid 1134
dc.identifier.contextkey 12700070
dc.identifier.s3bucket isulib-bepress-aws-west
dc.identifier.submissionpath stat_las_preprints/136
dc.identifier.uri https://dr.lib.iastate.edu/handle/20.500.12876/90297
dc.language.iso en
dc.source.bitstream archive/lib.dr.iastate.edu/stat_las_preprints/136/2018_Kim_SamplingTechniques.pdf|||Fri Jan 14 19:56:45 UTC 2022
dc.subject.disciplines Design of Experiments and Sample Surveys
dc.subject.disciplines Probability
dc.subject.disciplines Statistical Methodology
dc.subject.keywords Data integration
dc.subject.keywords inverse sampling
dc.subject.keywords non-probability sample
dc.subject.keywords selection bias
dc.title Sampling techniques for big data analysis in finite population inference
dc.type article
dc.type.genre article
dspace.entity.type Publication
relation.isAuthorOfPublication fdf914ae-e48d-4f4e-bfa2-df7a755320f4
relation.isOrgUnitOfPublication 264904d9-9e66-4169-8e11-034e537ddbca
File
Original bundle
Name: 2018_Kim_SamplingTechniques.pdf
Size: 183.93 KB
Format: Adobe Portable Document Format