Affiliations: Hungarian Central Statistical Office, Corvinus University of Budapest, 1093 Budapest, Fövám sq. 8, Hungary. E-mail: antal.ertl@uni-corvinus.hu.
Corresponding author: Hungarian Central Statistical Office, Corvinus University of Budapest, 1093 Budapest, Fövám sq. 8, Hungary. E-mail: antal.ertl@uni-corvinus.hu.
Abstract: In this paper, we contribute to the research on outlier handling, concentrating on economic statistical data, namely observations in housing statistics. Creating price indices requires both data cleaning and model optimization, and identifying outlying observations is crucial for both. By applying various techniques, such as distance-based and density-based outlier detection methods, we highlight the importance of dealing with outliers and discuss the difficulties one might encounter. Housing statistics is a special case, as price is highly correlated with the area of the dwelling in question, but it still serves as a fine example of handling outliers in economic and transaction data. We show that identifying outliers is a nuanced task in which statisticians could benefit from using advanced algorithms, such as the Local Outlier Factor (LOF) or Feature Bagging Outlier Detection (FBOD).
Keywords: outlier detection, housing statistics, official statistics