Article type: Research Article
Authors: Młodak, Andrzej [a, b, *] | Pietrzak, Michał [a] | Józefowski, Tomasz [c, a]
Affiliations: [a] Statistical Office in Poznań, Centre for Small Area Estimation, Poznań, Poland | [b] Inter-faculty Department of Mathematics and Statistics, Calisia University – Kalisz, Poland | [c] Poznań University of Economics and Business, Poznań, Poland
Correspondence: [*] Corresponding author: Andrzej Młodak, Statistical Office in Poznań, Branch in Kalisz, ul. Piwonicka 7–9, 62–800 Kalisz, Poland. E-mail: a.mlodak@stat.gov.pl.
Note: [1] The paper was presented during the 2021 joint UNECE/Eurostat Expert Meeting on Statistical Data Confidentiality on 1–3 December 2021 in Poznań, Poland, https://unece.org/statistics/events/SDC2021.
Abstract: One of the key problems associated with Statistical Disclosure Control (SDC) is ensuring an optimal trade-off between minimizing the risk of unit identification and maximizing the utility of the data to be disseminated (which means minimizing information loss due to the application of SDC methods). In practice, this is usually achieved by defining how much risk can be accepted for any given unit, and then modifying the data set so that the risk falls below the preset threshold while utility is maximized. Moreover, variables from statistical surveys vary not only in terms of their measurement scale but also as regards the role they play in the SDC process. All these aspects should therefore be taken into account when one tries to find this trade-off. In the paper we present a way of assessing whether an optimal trade-off has been achieved. Two main aspects of measuring the risk of disclosure are discussed. The first one is internal risk, i.e. the risk of disclosing confidential information solely on the basis of the disseminated microdata after the application of SDC (i.e. no attempt is made to combine the data with external information); the second one is external risk, which arises when the user has access to an alternative data set containing information that can be linked with statistical data in order to identify a unit. We show that it is possible to measure external risk and information loss while accounting for the measurement scale of variables. In our empirical study we used data from an annual survey of accidents at work for 2017. We compared complex information loss and the risk of disclosure in the original data files and those subjected to SDC using methods implemented in the new working version of the sdcMicro R package.
We present the underlying assumptions and results of the SDC process, highlighting the benefits and drawbacks of the tools used in the study, which was conducted in 2020 and 2021 in the Centre for Small Area Estimation at the Statistical Office in Poznań.
Keywords: Statistical disclosure control, risk of disclosure, information loss, survey of accidents at work
DOI: 10.3233/SJI-220936
Journal: Statistical Journal of the IAOS, vol. 38, no. 4, pp. 1503-1511, 2022