Issue title: Papers from EKAW 2018
Guest editors: Catherina Faron and Chiara Ghidini
Article type: Research Article
Authors: Idrissou, Al [a; b; *] | van Harmelen, Frank [a] | van den Besselaar, Peter [b]
Affiliations: [a] Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands. E-mails: o.a.k.idrissou@vu.nl, frank.van.harmelen@vu.nl | [b] Department of Organization Sciences, Vrije Universiteit Amsterdam, The Netherlands. E-mail: p.a.a.vanden.besselaar@vu.nl
Correspondence: [*] Corresponding author. E-mail: o.a.k.idrissou@vu.nl.
Note: [1] This is an extended version, by invitation, of a paper accepted at the 21st International Conference on Knowledge Engineering and Knowledge Management (EKAW 2018) (In Knowledge Engineering and Knowledge Management (2018) 147–162 Springer).
Abstract: Matching entities between datasets is a crucial step for combining multiple datasets on the semantic web. A rich literature exists on different approaches to this entity resolution problem. However, much less work has been done on how to assess the quality of such entity links once they have been generated. Evaluation methods for link quality are typically limited to either comparison with a ground truth dataset (which is often not available), manual work (which is cumbersome and prone to error), or crowd sourcing (which is not always feasible, especially if expert knowledge is required). Furthermore, the problem of link evaluation is greatly exacerbated for links between more than two datasets, because the number of possible links grows rapidly with the number of datasets. In this paper, we propose a method to estimate the quality of entity links between multiple datasets. We exploit the fact that the links between entities from multiple datasets form a network, and we show how simple metrics on this network can reliably predict their quality. We verify our results in a large experimental study using six datasets from the domain of science, technology and innovation studies, for which we created a gold standard. This gold standard, available online, is an additional contribution of this paper. In addition, we evaluate our metric on a recently published gold standard to confirm our findings.
Keywords: Entity resolution, data integration, network metrics
DOI: 10.3233/SW-200410
Journal: Semantic Web, vol. 12, no. 1, pp. 21-40, 2021