Affiliations: Methodology and Data Science Division, Australian Bureau of Statistics, Level 3, 818 Bourke Street, Docklands, VIC 3000, Australia | E-mail: ryan.covey@abs.gov.au
Correspondence: [*] Corresponding author: Methodology and Data Science Division, Australian Bureau of Statistics, Level 3, 818 Bourke Street, Docklands, VIC 3000, Australia. E-mail: ryan.covey@abs.gov.au
Note: [1] Views expressed in this paper are those of the author and do not necessarily represent those of the Australian Bureau of Statistics. Where quoted or used, they should be attributed clearly to the author.
Abstract: An ever-increasing deluge of big data is becoming available to national statistical offices globally, but it is well documented that statistics produced from big data alone often suffer from selection bias and are usually not representative of the population at large. In this paper, we construct a new design-based estimator of the median by integrating big data and survey data. Our estimator is asymptotically unbiased and has a smaller variance than a median estimator produced using survey data alone.
Keywords: Big data, data integration, probability sampling, survey design, quantile estimation