Impact Factor 2023: 3
The journal Semantic Web – Interoperability, Usability, Applicability is an international and interdisciplinary journal that brings together researchers from various fields who share the vision of, and need for, more effective and meaningful ways to share information across agents and services on the future Internet and elsewhere.
As such, Semantic Web technologies shall support the seamless integration of data, on-the-fly composition and interoperation of Web services, as well as more intuitive search engines. The semantics – or meaning – of information, however, cannot be defined without a context, which makes personalization, trust and provenance core topics for Semantic Web research.
New retrieval paradigms, user interfaces and visualization techniques have to unleash the power of the Semantic Web and at the same time hide its complexity from the user. Based on this vision, the journal welcomes contributions ranging from theoretical and foundational research over methods and tools to descriptions of concrete ontologies and applications in all areas. Papers which add a social, spatial and temporal dimension to Semantic Web research, as well as application-oriented papers making use of formal semantics, are especially welcome.
The journal is co-published by the Akademische Verlagsgesellschaft AKA.
Authors: Liartis, Jason | Dervakos, Edmund | Menis-Mastromichalakis, Orfeas | Chortaras, Alexandros | Stamou, Giorgos
Article Type: Research Article
Abstract: Deep learning models have achieved impressive performance in various tasks, but they are usually opaque with regard to their inner complex operation, obfuscating the reasons for which they make decisions. This opacity raises ethical and legal concerns regarding the real-life use of such models, especially in critical domains such as medicine, and has led to the emergence of the eXplainable Artificial Intelligence (XAI) field of research, which aims to make the operation of opaque AI systems more comprehensible to humans. The problem of explaining a black-box classifier is often approached by feeding it data and observing its behaviour. In this work, we feed the classifier with data that are part of a knowledge graph, and describe its behaviour with rules expressed in the terminology of the knowledge graph, which is understandable by humans. We first theoretically investigate the problem to provide guarantees for the extracted rules, and then investigate the relation of “explanation rules for a specific class” with “semantic queries collecting from the knowledge graph the instances classified by the black-box classifier to this specific class”. Thus we approach the problem of extracting explanation rules as a semantic query reverse engineering problem. We develop algorithms for solving this inverse problem as a heuristic search in the space of semantic queries, evaluate the proposed algorithms on four simulated use-cases, and discuss the results.
Keywords: Explainable AI (XAI), opaque machine learning classifiers, knowledge graphs, description logics, semantic query answering, reverse query answering, post-hoc explainability, explanation rules
DOI: 10.3233/SW-233469
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2023
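The reverse-engineering idea in the abstract above can be illustrated with a toy sketch: search the space of conjunctive queries for one whose answer set best matches the classifier's positives. All data, the F1 scoring, and the exhaustive search are illustrative stand-ins, not the paper's actual description-logic algorithms.

```python
from itertools import combinations

# Toy knowledge graph: entity -> set of atomic concepts it satisfies.
kg = {
    "p1": {"Smoker", "Elderly", "Male"},
    "p2": {"Smoker", "Elderly"},
    "p3": {"Smoker", "Male"},
    "p4": {"Elderly"},
}
# Black-box classifier output: entities assigned to the target class.
positives = {"p1", "p2"}

def answers(query):
    """Entities satisfying every concept in the conjunctive query."""
    return {e for e, concepts in kg.items() if query <= concepts}

def score(query):
    """F1 between the query's answer set and the classifier's positives."""
    ans = answers(query)
    if not ans:
        return 0.0
    prec = len(ans & positives) / len(ans)
    rec = len(ans & positives) / len(positives)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)

# Exhaustive enumeration of small conjunctions stands in for the
# heuristic search through the space of semantic queries.
vocab = sorted(set.union(*kg.values()))
candidates = [frozenset(c) for n in (1, 2) for c in combinations(vocab, n)]
best = max(candidates, key=score)
print(sorted(best), round(score(best), 2))  # → ['Elderly', 'Smoker'] 1.0
```

Here the best query, Smoker ⊓ Elderly, exactly reproduces the classifier's decisions on the toy data, so it can be read back as an explanation rule for the class.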
Authors: Adamski, Dariusz Max | Potoniec, Jędrzej
Article Type: Research Article
Abstract: We present a novel approach for learning embeddings of ALC knowledge base concepts. The embeddings reflect the semantics of the concepts in such a way that it is possible to compute an embedding of a complex concept from the embeddings of its parts by using appropriate neural constructors. Embeddings for different knowledge bases are vectors in a shared vector space, shaped in such a way that approximate subsumption checking for arbitrarily complex concepts can be done by the same neural network, called a reasoner head, for all the knowledge bases. To underline this unique property of enabling reasoning directly on embeddings, we call them reason-able embeddings. We report the results of an experimental evaluation showing that the difference in reasoning performance between training a separate reasoner head for each ontology and using a shared reasoner head is negligible.
Keywords: Neural-symbolic integration, deep deductive reasoning, embeddings, transfer learning, deep learning
DOI: 10.3233/SW-233355
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-33, 2023
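The central mechanism described above, computing the embedding of a complex concept from the embeddings of its parts via a neural constructor, can be sketched minimally as follows. The random vectors and the single untrained layer are illustrative assumptions; in the paper both the concept embeddings and the constructors are trained jointly with the reasoner head.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4

# Embeddings of atomic concepts (in the paper these are learned so that
# subsumption is checkable by a shared reasoner head; here they are random).
atomic = {"Person": rng.normal(size=DIM), "Student": rng.normal(size=DIM)}

# A neural constructor maps the embeddings of the parts of a complex
# concept to an embedding of the whole; this one, for conjunction,
# is a single untrained linear layer over the concatenated parts.
W_and = rng.normal(size=(DIM, 2 * DIM))

def embed_and(c, d):
    """Embedding of the conjunction of c and d, built from their embeddings."""
    return np.tanh(W_and @ np.concatenate([c, d]))

person_and_student = embed_and(atomic["Person"], atomic["Student"])
print(person_and_student.shape)  # → (4,)
```

Because the constructor is a function of the part embeddings, arbitrarily nested concepts can be embedded by recursive application, which is what lets a single reasoner head operate on any complex concept.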
Authors: Cima, Gianluca | Croce, Federico | Lenzerini, Maurizio
Article Type: Research Article
Abstract: Given two datasets, i.e., two sets of tuples of constants, representing positive and negative examples, logical separability is the reasoning task of finding a formula in a certain target query language that separates them. As already pointed out in previous works, this task turns out to be relevant in several application scenarios such as concept learning and generating referring expressions. Besides, if we think of the input datasets of positive and negative examples as composed of tuples of constants classified, respectively, positively and negatively by a black-box model, then the separating formula can be used to provide global post-hoc explanations of such a model. In this paper, we study the separability task in the context of Ontology-based Data Management (OBDM), in which a domain ontology provides a high-level, logic-based specification of a domain of interest, semantically linked through suitable mapping assertions to the data source layer of an information system. Since a formula that properly separates (proper separation) two input datasets does not always exist, our first contribution is to propose (best) approximations of the proper separation, called (minimally) complete and (maximally) sound separations. We do this by presenting a general framework for separability in OBDM. Then, in a scenario that uses by far the most popular languages for the OBDM paradigm, our second contribution is a comprehensive study of three natural computational problems associated with the framework, namely Verification (check whether a given formula is a proper, complete, or sound separation of two given datasets), Existence (check whether a proper, or best approximated separation of two given datasets exists at all), and Computation (compute any proper, or any best approximated separation of two given datasets).
Keywords: Ontology-based Data Management, Separability, Explainable Artificial Intelligence, Semantic Technologies
DOI: 10.3233/SW-233391
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-36, 2023
Authors: Rivas, Ariam | Collarana, Diego | Torrente, Maria | Vidal, Maria-Esther
Article Type: Research Article
Abstract: Neuro-Symbolic Artificial Intelligence (AI) focuses on integrating symbolic and sub-symbolic systems to enhance the performance and explainability of predictive models. Symbolic and sub-symbolic approaches differ fundamentally in how they represent data and make use of data features to reach conclusions. Neuro-symbolic systems have recently received significant attention in the scientific community. However, despite efforts in neural-symbolic integration, symbolic processing can still be better exploited, mainly when these hybrid approaches are defined on top of knowledge graphs. This work is built on the statement that knowledge graphs can naturally represent the convergence between data and their contextual meaning (i.e., knowledge). We propose a hybrid system that resorts to symbolic reasoning, expressed as a deductive database, to augment the contextual meaning of entities in a knowledge graph, thus improving the performance of link prediction implemented using knowledge graph embedding (KGE) models. An entity context is defined as the ego network of the entity in a knowledge graph. Given a link prediction task, the proposed approach deduces new RDF triples in the ego networks of the entities corresponding to the heads and tails of the prediction task on the knowledge graph (KG). Since knowledge graphs may be incomplete and sparse, the facts deduced by the symbolic system not only reduce sparsity but also make explicit meaningful relations among the entities that compose an entity ego network. As a proof of concept, our approach is applied over a KG for lung cancer to predict treatment effectiveness. The empirical results put the deduction power of deductive databases into perspective. They indicate that making deduced relationships explicit in the ego networks empowers all the studied KGE models to generate more accurate links.
Keywords: Neuro-symbolic artificial intelligence, deductive systems, knowledge graph embeddings, drug-drug interactions
DOI: 10.3233/SW-233324
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2023
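The notion of an entity's context as its ego network can be made concrete with a small sketch. The triples below are hypothetical; in the paper the ego networks come from a lung-cancer KG and are then enriched with triples deduced by the deductive database before KGE training.

```python
# Toy RDF-style triples as (subject, predicate, object).
triples = [
    ("drug_A", "interactsWith", "drug_B"),
    ("drug_B", "treats", "lung_cancer"),
    ("patient_1", "receives", "drug_A"),
    ("drug_C", "treats", "asthma"),
]

def ego_network(entity, kg_triples):
    """All triples in which the entity appears as subject or object --
    the 'entity context' the hybrid system augments with deduced facts."""
    return [t for t in kg_triples if entity in (t[0], t[2])]

print(ego_network("drug_A", triples))
```

For "drug_A" this keeps only the two triples mentioning it; deduced facts added to such neighbourhoods are what reduce sparsity for the embedding models.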
Authors: Badenes-Olmedo, Carlos | Corcho, Oscar
Article Type: Research Article
Abstract: There are two main limitations in most of the existing Knowledge Graph Question Answering (KGQA) algorithms. First, the approaches depend heavily on the structure and cannot be easily adapted to other KGs. Second, the availability and amount of additional domain-specific data in structured or unstructured formats has also proven to be critical in many of these systems. Such dependencies limit the applicability of KGQA systems and make their adoption difficult. A novel algorithm is proposed, MuHeQA, that alleviates both limitations by retrieving the answer from textual content automatically generated from KGs instead of queries over them. This new approach (1) works on one or several KGs simultaneously, (2) does not require training data, which makes it domain-independent, (3) enables the combination of knowledge graphs with unstructured information sources to build the answer, and (4) reduces the dependency on the underlying schema, since it does not navigate through structured content but only reads property values. MuHeQA extracts answers from textual summaries created by combining information related to the question from multiple knowledge bases, be they structured or not. Experiments over Wikidata and DBpedia show that our approach achieves comparable performance to other approaches in single-fact questions while being domain and KG independent. Results raise important questions for future work about how the textual content that can be created from knowledge graphs enables answer extraction.
Keywords: Question answering, natural language processing, knowledge graphs
DOI: 10.3233/SW-233379
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-15, 2023
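The key move described above, reading property values and verbalising them into text that an extractive QA model can answer over, can be sketched as follows. The entity and its property values are hypothetical examples, and the template is an illustrative simplification of whatever verbalisation MuHeQA actually uses.

```python
def verbalise(entity_label, properties):
    """Turn KG property/value pairs into a plain-text summary, from which
    an extractive QA model can then pick the answer span."""
    sentences = [f"{entity_label}'s {prop} is {value}."
                 for prop, value in properties.items()]
    return " ".join(sentences)

# Hypothetical property values, e.g. as read for a Wikidata entity.
summary = verbalise("Marie Curie", {
    "date of birth": "7 November 1867",
    "field of work": "physics and chemistry",
})
print(summary)
```

A question such as "When was Marie Curie born?" is then answered against `summary` by span extraction, so no KG-specific query language or training data is needed.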
Authors: Stüber, Moritz | Frey, Georg
Article Type: Research Article
Abstract: Modelling and Simulation (M&S) are core tools for designing, analysing and operating today’s industrial systems. They often also represent both a valuable asset and a significant investment. Typically, their use is constrained to a software environment intended to be used by engineers on a single computer. However, the knowledge relevant to a task involving modelling and simulation is in general distributed in nature, even across organizational boundaries, and may be large in volume. Therefore, it is desirable to increase the FAIRness (Findability, Accessibility, Interoperability, and Reuse) of M&S capabilities; to enable their use in loosely coupled systems of systems; and to support their composition and execution by intelligent software agents. In this contribution, the suitability of Semantic Web technologies to achieve these goals is investigated and an open-source proof-of-concept implementation based on the Functional Mock-up Interface (FMI) standard is presented. Specifically, models, model instances, and simulation results are exposed through a hypermedia API and an implementation of the Pragmatic Proof Algorithm (PPA) is used to successfully demonstrate the API’s use by a generic software agent. The solution shows an increased degree of FAIRness and fully supports its use in loosely coupled systems. The FAIRness could be further improved by providing more “rich” (meta)data.
Keywords: Models and Simulation as a Service, FMI, hypermedia API, Pragmatic Proof Algorithm, FAIR principles
DOI: 10.3233/SW-233359
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-36, 2023
Authors: Flores, Javier | Rabbani, Kashif | Nadal, Sergi | Gómez, Cristina | Romero, Oscar | Jamin, Emmanuel | Dasiopoulou, Stamatia
Article Type: Research Article
Abstract: Virtual data integration is the current approach to data wrangling in data-driven decision-making. In this paper, we focus on automating schema integration, which extracts a homogenised representation of the data source schemata and integrates them into a global schema to enable virtual data integration. Schema integration requires a set of well-known constructs: the data source schemata and wrappers, a global integrated schema and the mappings between them. Based on them, virtual data integration systems enable fast and on-demand data exploration via query rewriting. Unfortunately, the generation of such constructs is currently performed in a largely manual manner, hindering its feasibility in real scenarios. This becomes aggravated when dealing with heterogeneous and evolving data sources. To overcome these issues, we propose a fully-fledged semi-automatic and incremental approach grounded on knowledge graphs to generate the required schema integration constructs in four main steps: bootstrapping, schema matching, schema integration, and generation of system-specific constructs. We also present Nextia DI, a tool implementing our approach. Finally, a comprehensive evaluation is presented to scrutinize our approach.
Keywords: Schema integration, bootstrapping, virtual data integration
DOI: 10.3233/SW-233347
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-38, 2023
Authors: Umbrico, Alessandro | Cesta, Amedeo | Orlandini, Andrea
Article Type: Research Article
Abstract: The diffusion of Human-Robot Collaborative cells is prevented by several barriers. Classical control approaches seem not yet fully suitable for facing the variability introduced by the presence of human operators working alongside robots. The capability to represent heterogeneous knowledge and perform abstract reasoning is crucial to enhance the flexibility of control solutions. To this aim, the ontology SOHO (Sharework Ontology for Human-Robot Collaboration) has been specifically designed for representing Human-Robot Collaboration scenarios, following a context-based approach. This work brings several contributions. First, this paper proposes an extension of SOHO to better characterize behavioral constraints of collaborative tasks. Furthermore, this work shows a knowledge extraction procedure designed to automate the synthesis of Artificial Intelligence plan-based controllers for realizing flexible coordination of human and robot behaviors in collaborative tasks. The generality of the ontological model and the developed representation capabilities, as well as the validity of the synthesized planning domains, are evaluated on a number of realistic industrial scenarios where collaborative robots are actually deployed.
Keywords: Ontology, knowledge representation and reasoning, Human-Robot Collaboration, automated planning and scheduling, Artificial Intelligence
DOI: 10.3233/SW-233394
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-40, 2023
Authors: Bareedu, Yashoda Saisree | Frühwirth, Thomas | Niedermeier, Christoph | Sabou, Marta | Steindl, Gernot | Thuluva, Aparna Saisree | Tsaneva, Stefani | Tufek Ozkaya, Nilay
Article Type: Research Article
Abstract: Industrial standards provide guidelines for data modeling to ensure interoperability between stakeholders of an industry branch (e.g., robotics). Most frequently, such guidelines are provided in an unstructured format (e.g., pdf documents), which hampers the automated validation of information objects (e.g., data models) that rely on such standards in terms of their compliance with the modeling constraints prescribed by the guidelines. This raises the risk of costly interoperability errors induced by the incorrect use of the standards. There is, therefore, an increased interest in automatic semantic validation of information objects based on industrial standards. In this paper we focus on an approach to semantic validation by formally representing the modeling constraints from unstructured documents as explicit, machine-actionable rules (to be then used for semantic validation) and (semi-)automatically extracting such rules from pdf documents. While our approach aims to be generically applicable, we exemplify an adaptation of the approach in the concrete context of the OPC UA industrial standard, given its large-scale adoption among important industrial stakeholders and the OPC UA internal efforts towards semantic validation. We conclude that (i) it is feasible to represent modeling constraints from the standard specifications as rules, which can be organized in a taxonomy and represented using Semantic Web technologies such as OWL and SPARQL; (ii) we could automatically identify modeling constraints in the specification documents by inspecting the tables (P = 87%) and text of these documents (F1 up to 94%); (iii) the translation of the modeling constraints into formal rules could be fully automated when constraints were extracted from tables and required a Human-in-the-loop approach for constraints extracted from text.
Keywords: Semantic validation, information extraction, natural language processing, human-in-the-loop, OPC UA
DOI: 10.3233/SW-233342
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-38, 2023
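To give a flavour of what a machine-actionable rule extracted from a specification might look like, here is an illustrative constraint expressed as a SPARQL ASK query held in a Python string. The `ua:` vocabulary and the constraint wording are hypothetical, not taken from the OPC UA specification or the paper's rule taxonomy.

```python
# A modeling constraint rewritten as a machine-actionable SPARQL rule.
# The query returns true when a violating node exists (validation fails).
CONSTRAINT_RULE = """
# "Every Object node must have a HasTypeDefinition reference." (illustrative)
ASK {
  ?node a ua:Object .
  FILTER NOT EXISTS { ?node ua:hasTypeDefinition ?type . }
}
"""
print("ASK" in CONSTRAINT_RULE)  # → True
```

Rules of this shape can be run by any SPARQL engine over a data model serialized as RDF, which is what makes the extracted constraints directly usable for semantic validation.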
Authors: Yahya, Muhammad | Ali, Aabid | Mehmood, Qaiser | Yang, Lan | Breslin, John G. | Ali, Muhammad Intizar
Article Type: Research Article
Abstract: Industry 4.0 (I4.0) is a new era in the industrial revolution that emphasizes machine connectivity, automation, and data analytics. The I4.0 pillars, such as autonomous robots, cloud computing, horizontal and vertical system integration, and the industrial internet of things, have increased the performance and efficiency of production lines in the manufacturing industry. Over the past years, efforts have been made to propose semantic models to represent the manufacturing domain knowledge; one such model is the Reference Generalized Ontological Model (RGOM, https://w3id.org/rgom). However, its adaptability, like that of other models, is not ensured due to the lack of manufacturing data. In this paper, we aim to develop a benchmark dataset for knowledge graph generation in Industry 4.0 production lines and to show the benefits of using ontologies and semantic annotations of data, showcasing how the I4.0 industry can benefit from KGs and semantic datasets. This work is the result of collaboration with the production line managers, supervisors, and engineers in the football industry to acquire realistic production line data (https://github.com/MuhammadYahta/ManufacturingProductionLineDataSetGeneration-Football, https://zenodo.org/record/7779522). Knowledge Graphs (KGs) have emerged as a significant technology to store the semantics of domain entities. KGs have been used in a variety of industries, including banking, the automobile industry, oil and gas, pharmaceuticals and health care, publishing, and media. The data is mapped and populated to the RGOM classes and relationships using an automated solution based on the Jena API, producing an I4.0 KG. It contains more than 2.5 million axioms and about 1 million instances. This KG enables us to demonstrate the adaptability and usefulness of the RGOM. Our research helps the production line staff to take timely decisions by exploiting the information embedded in the KG. In relation to this, the RGOM’s adaptability is demonstrated with the help of a use case scenario to discover required information such as the current temperature at a particular time, the status of the motor, and the tools deployed on the machine.
Keywords: Industry 4.0, production line, Knowledge Graphs, Industry 4.0 Knowledge Graph
DOI: 10.3233/SW-233431
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-19, 2023
Authors: Teze, Juan Carlos L. | Paredes, Jose Nicolas | Martinez, Maria Vanina | Simari, Gerardo Ignacio
Article Type: Research Article
Abstract: The role of explanations in intelligent systems has in the last few years entered the spotlight as AI-based solutions appear in an ever-growing set of applications. Though data-driven (or machine learning) techniques are often used as examples of how opaque (also called black box) approaches can lead to problems such as bias and general lack of explainability and interpretability, in reality these features are difficult to tame in general, even for approaches that are based on tools typically considered to be more amenable, like knowledge-based formalisms. In this paper, we continue a line of research and development towards building tools that facilitate the implementation of explainable and interpretable hybrid intelligent socio-technical systems, focusing on features that users can leverage to build explanations to their queries. In particular, we present the implementation of a recently-proposed application framework (and make available its source code) for developing such systems, and explore user-centered mechanisms for building explanations based both on the kinds of explanations required (such as counterfactual, contextual, etc.) and the inputs used for building them (coming from various sources, such as the knowledge base and lower-level data-driven modules). In order to validate our approach, we develop two use cases, one as a running example for detecting hate speech in social platforms and the other as an extension that also contemplates cyberbullying scenarios.
Keywords: Ontological languages, socio-technical systems, Explainable Artificial Intelligence, hate speech in social platforms
DOI: 10.3233/SW-233297
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-30, 2023
Authors: Faria, Daniel | Santos, Emanuel | Balasubramani, Booma Sowkarthiga | Silva, Marta C. | Couto, Francisco M. | Pesquita, Catia
Article Type: Research Article
Abstract: Ontology matching establishes correspondences between entities of related ontologies, with applications ranging from enabling semantic interoperability to supporting ontology and knowledge graph development. Its demand within the Semantic Web community is on the rise, as the popularity of knowledge graphs supporting information systems or artificial intelligence applications continues to increase. In this article, we showcase AgreementMakerLight (AML), an ontology matching system in continuous development since 2013, with demonstrated performance over nine editions of the Ontology Alignment Evaluation Initiative (OAEI), and a history of real-world applications across a variety of domains. We overview AML’s architecture and algorithms, its user interfaces and functionalities, its performance, and its impact. AML has participated in more OAEI tracks since 2013 than any other matching system, has a median rank by F-measure between 1 and 2 across all tracks in every year since 2014, and a rank by run time between 3 and 4. Thus, it offers a combination of range, quality and efficiency that few matching systems can rival. Moreover, AML’s impact can be gauged by the 263 (non-self) publications that cite one or more of its papers, among which we count 34 real-world applications.
Keywords: Ontology matching, instance matching, tool
DOI: 10.3233/SW-233304
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-13, 2023
Authors: Brewster, Christopher | Kalatzis, Nikos | Nouwt, Barry | Kruiger, Han | Verhoosel, Jack
Article Type: Research Article
Abstract: The agrifood system faces a great many economic, social and environmental challenges. One of the biggest practical challenges has been to achieve greater data sharing throughout the agrifood system and the supply chain, both to inform other stakeholders about a product and equally to incentivise greater environmental sustainability. In this paper, a data sharing architecture is described built on three principles: (a) reuse of existing semantic standards; (b) integration with legacy systems; and (c) a distributed architecture where stakeholders control access to their own data. The system has been developed based on the requirements of commercial users and is designed to allow queries across a federated network of agrifood stakeholders. The Ploutos semantic model is built on an integration of existing ontologies. The Ploutos architecture is built on a discovery directory and interoperability enablers, which use graph query patterns to traverse the network and collect the requisite data to be shared. The system is exemplified in the context of a pilot involving commercial stakeholders in the processed fruit sector. The data sharing approach is highly extensible with considerable potential for capturing sustainability related data.
Keywords: Data sharing, supply chain, agrifood, graph pattern, ontology, Farm Management Systems
DOI: 10.3233/SW-233287
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-31, 2023
Authors: Păiș, Vasile | Mitrofan, Maria | Gasan, Carol Luca | Ianov, Alexandru | Ghiță, Corvin | Coneschi, Vlad Silviu | Onuț, Andrei
Article Type: Research Article
Abstract: LegalNERo is a manually annotated corpus for named entity recognition in the Romanian legal domain. It provides gold annotations for organizations, locations, persons, time expressions and legal resources mentioned in legal documents. Furthermore, GeoNames identifiers are provided. The resource is available in multiple formats, including span-based, token-based and RDF. The Linked Open Data version is available for both download and querying using SPARQL.
Keywords: Named entity recognition, linguistic linked data, Romanian language, corpus
DOI: 10.3233/SW-233351
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-14, 2023
Authors: Erenrich, Daniel
Article Type: Research Article
Abstract: Despite its size, Wikidata remains incomplete and inaccurate in many areas. Hundreds of thousands of articles on English Wikipedia have zero or limited meaningful structure on Wikidata. Much work has been done in the literature to partially or fully automate the process of completing knowledge graphs, but little of it has been practically applied to Wikidata. This paper presents two interconnected practical approaches to speeding up the Wikidata completion task. The first is Wwwyzzerdd, a browser extension that allows users to quickly import statements from Wikipedia to Wikidata. Wwwyzzerdd has been used to make over 100 thousand edits to Wikidata. The second is Psychiq, a new model for predicting instance and subclass statements based on English Wikipedia articles. Psychiq’s performance and characteristics make it well suited to solving a variety of problems for the Wikidata community. One initial use is integrating the Psychiq model into the Wwwyzzerdd browser extension.
Keywords: Wikidata, Wikipedia, browser extension, knowledge graph completion
DOI: 10.3233/SW-233450
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-14, 2023
Authors: Amaral, Gabriel | Rodrigues, Odinaldo | Simperl, Elena
Article Type: Research Article
Abstract: Knowledge Graphs are repositories of information that gather data from a multitude of domains and sources in the form of semantic triples, serving as a source of structured data for various crucial applications in the modern web landscape, from Wikipedia infoboxes to search engines. Such graphs mainly serve as secondary sources of information and depend on well-documented and verifiable provenance to ensure their trustworthiness and usability. However, their ability to systematically assess and assure the quality of this provenance, most crucially whether it properly supports the graph’s information, relies mainly on manual processes that do not scale with size. ProVe aims at remedying this: it is a pipelined approach that automatically verifies whether a Knowledge Graph triple is supported by text extracted from its documented provenance. ProVe is intended to assist information curators and consists of four main steps involving rule-based methods and machine learning models: text extraction, triple verbalisation, sentence selection, and claim verification. ProVe is evaluated on a Wikidata dataset, achieving promising results overall and excellent performance on the binary classification task of detecting support from provenance, with 87.5% accuracy and 82.9% F1-macro on text-rich sources. The evaluation data and scripts used in this paper are available in GitHub and Figshare.
Keywords: Fact verification, data verbalisation, knowledge graphs
DOI: 10.3233/SW-233467
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
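The four-step pipeline named in the abstract can be sketched as a skeleton in which every step is a stub: the sentences, the word-overlap ranking, and the binary verdict below are illustrative stand-ins for ProVe's rule-based extraction and trained models.

```python
def extract_text(reference_url):
    # Stub: the real step fetches and cleans the referenced provenance page.
    return ["Douglas Adams was born in Cambridge.",
            "He wrote The Hitchhiker's Guide to the Galaxy."]

def verbalise_triple(triple):
    # Turn a KG triple into a natural-language claim.
    s, p, o = triple
    return f"{s} {p} {o}."

def select_sentences(claim, sentences, k=1):
    # Stub ranking: lowercase word overlap stands in for a trained
    # sentence-selection model.
    def overlap(sent):
        return len(set(claim.lower().split()) & set(sent.lower().split()))
    return sorted(sentences, key=overlap, reverse=True)[:k]

def verify(claim, evidence):
    # Stub: the real step is a claim-verification (entailment) classifier.
    return "SUPPORTED" if evidence else "NOT ENOUGH INFO"

triple = ("Douglas Adams", "place of birth", "Cambridge")
claim = verbalise_triple(triple)
evidence = select_sentences(claim, extract_text("https://example.org/ref"))
print(claim, "->", verify(claim, evidence))
```

The value of the pipelined design is that each stage can be evaluated and swapped independently, which is how the paper reports per-task results such as the binary support-detection scores.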
Authors: Yaman, Beyza | Thompson, Kevin | Fahey, Fergus | Brennan, Rob
Article Type: Research Article
Abstract: This work describes the application of semantic web standards to data quality governance of data production pipelines in the architectural, engineering, and construction (AEC) domain for Ordnance Survey Ireland (OSi). It illustrates a new approach to data quality governance based on establishing a unified knowledge graph for data quality measurements across a complex, heterogeneous, quality-centric data production pipeline. It provides the first comprehensive formal mappings between semantic models of data quality dimensions defined by the four International Organization for Standardization (ISO) and World Wide Web Consortium (W3C) data quality standards applied by different tools and stakeholders. It provides an approach to uplift rule-based data quality reports into quality metrics suitable for aggregation and end-to-end analysis. Current industrial practice tends towards stove-piped, vendor-specific and domain-dependent tools to process data quality observations; however, there is a lack of open techniques and methodologies for combining quality measurements derived from different data quality standards to provide end-to-end data quality reporting, root cause analysis or visualisation. This work demonstrated that it is effective to use a knowledge graph and semantic web standards to unify distributed data quality monitoring in an organisation and present the results in an end-to-end data dashboard in a data quality standards-agnostic fashion for the Ordnance Survey Ireland data publishing pipeline.
Keywords: Geospatial Linked Data, data quality, data governance
DOI: 10.3233/SW-233293
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
Authors: Zeginis, Dimitris | Kalampokis, Evangelos | Palma, Raul | Atkinson, Rob | Tarabanis, Konstantinos
Article Type: Research Article
Abstract: In the domains of agriculture and livestock farming, a large amount of data is produced through numerous heterogeneous sources, including sensor data, weather/climate data, statistical and government data, drone/satellite imagery, video, and maps. This plethora of data can be used in precision agriculture and precision livestock farming to provide predictive insights into farming operations, drive real-time operational decisions, redesign business processes and support policy-making. The predictive power of the data can be further boosted if data from diverse sources are integrated and processed together, thus providing more unexplored insights. However, the exploitation and integration of data used in precision agriculture is not straightforward, since they: i) cannot be easily discovered across the numerous heterogeneous sources and ii) use different structural and naming conventions, hindering their interoperability. The aim of this paper is to: i) study the characteristics of data used in precision agriculture and livestock farming, ii) study the user requirements related to data modeling and processing from nine real cases in the agriculture, livestock farming and aquaculture domains, and iii) propose a semantic meta-model based on W3C standards (DCAT, PROV-O and the QB vocabulary) to enable the definition of metadata that facilitate the discovery, exploration, integration and accessing of data in the domain.
Keywords: Semantic model, metadata, data integration, precision agriculture, precision livestock farming, DCAT
DOI: 10.3233/SW-233156
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2023
Authors: Thornton, Katherine | Seals-Nutt, Kenneth | Matsuzaki, Mika | Dooley, Damion
Article Type: Research Article
Abstract: We describe our work to integrate the FoodOn ontology with our knowledge base of food composition data, WikiFCD. WikiFCD is a knowledge base of structured data related to food composition and food items. With the goal of reusing FoodOn identifiers for food items, we imported a subset of the FoodOn ontology into the WikiFCD knowledge base. We aligned the import via a shared use of NCBI taxon identifiers for the taxon names of the plants from which the food items are derived. Reusing FoodOn benefits WikiFCD by allowing us to leverage the food item groupings that FoodOn contains. This integration also has potential future benefits for the FoodOn community, because WikiFCD provides food composition data at the food item level, is mapped to Wikidata, and provides a SPARQL endpoint that supports federated queries. Federated queries across WikiFCD and Wikidata allow us to ask questions about food items that benefit from the cross-domain information of Wikidata, greatly increasing the breadth of possible data combinations.
Keywords: Food composition data, Wikibase, FoodOn
DOI: 10.3233/SW-233207
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-12, 2023
Authors: Woods, Caitlin | Selway, Matt | Bikaun, Tyler | Stumptner, Markus | Hodkiewicz, Melinda
Article Type: Research Article
Abstract: Maintenance of assets is a multi-million dollar cost each year for asset-intensive organisations in the defence, manufacturing, resource and infrastructure sectors. These costs are tracked through maintenance work order (MWO) records. MWO records contain structured data for dates, costs, and asset identification, and unstructured text describing the work required, for example ‘replace leaking pump’. Our focus in this paper is on data quality for maintenance activity terms in MWO records (e.g. replace, repair, adjust and inspect). We present two contributions in this paper. First, we propose a reference ontology for maintenance activity terms. We use natural language processing to identify seven core maintenance activity terms and their synonyms from 800,000 MWOs. We provide elucidations for these seven terms. Second, we demonstrate use of the reference ontology in an application-level ontology using an industrial use case. The end-to-end NLP-ontology pipeline identifies data quality issues with 55% of the MWO records for a centrifugal pump over 8 years. For the 33% of records where a verb was not provided in the unstructured text, the ontology can infer a relevant activity class. The selection of the maintenance activity terms is informed by the ISO 14224 and ISO 15926-4 standards and conforms to the ISO/IEC 21838-2 Basic Formal Ontology (BFO). The reference and application ontologies presented here provide an example of how industrial organisations can augment their maintenance work management processes with ontological workflows to improve data quality.
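The normalisation step described above, mapping free-text MWO descriptions onto a small set of core activity terms via synonym lists, can be sketched as follows. The term list and synonyms here are purely illustrative stand-ins, not the paper's actual seven-term set:

```python
# Illustrative sketch: map a free-text MWO description to a core
# maintenance activity term via a synonym lexicon. The lexicon below
# is invented for the example, not the ontology's actual term set.

ACTIVITY_LEXICON = {
    "replace": {"replace", "replaced", "changeout", "swap"},
    "repair": {"repair", "repaired", "fix", "fixed"},
    "inspect": {"inspect", "inspection", "check", "checked"},
    "adjust": {"adjust", "adjusted", "tighten", "calibrate"},
}

def normalise_activity(mwo_text):
    """Return the core activity term for an MWO description, or None
    when no activity verb is found (the data-quality case where an
    activity class must instead be inferred from the ontology)."""
    tokens = mwo_text.lower().split()
    for term, synonyms in ACTIVITY_LEXICON.items():
        if any(tok in synonyms for tok in tokens):
            return term
    return None

print(normalise_activity("replace leaking pump"))  # replace
print(normalise_activity("pump bearing noisy"))    # None
```

The `None` branch corresponds to the 33% of records noted in the abstract where no verb is present and inference is needed.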
Keywords: Maintenance work order, ontology, natural language processing, centrifugal pump
DOI: 10.3233/SW-233299
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
Authors: Li, Juan | Chen, Xiangnan | Yu, Hongtao | Chen, Jiaoyan | Zhang, Wen
Article Type: Research Article
Abstract: Knowledge graph reasoning (KGR) aims to infer new knowledge or detect noise, which is essential for improving the quality of knowledge graphs. Recently, various KGR techniques, such as symbolic- and embedding-based methods, have been proposed and have shown strong reasoning ability. Symbolic-based reasoning methods infer missing triples according to predefined rules or ontologies. Although rules and axioms have proven effective, they are difficult to obtain. Embedding-based reasoning methods represent entities and relations as vectors and complete KGs via vector computation. However, they mainly rely on structural information and ignore implicit axiom information that is not predefined in KGs but can be reflected in data. That is, each correct triple is also a logically consistent triple and satisfies all axioms. In this paper, we propose a novel NeuRal Axiom Network (NeuRAN) framework that combines explicit structural and implicit axiom information without introducing additional ontologies. Specifically, the framework consists of a KG embedding module that preserves the semantics of triples and five axiom modules that encode five kinds of implicit axioms. These axioms correspond to five typical object property expression axioms defined in OWL2: ObjectPropertyDomain, ObjectPropertyRange, DisjointObjectProperties, IrreflexiveObjectProperty and AsymmetricObjectProperty. The KG embedding module and the axiom modules compute the scores that a triple conforms to the semantics and to the corresponding axioms, respectively. Compared with KG embedding models and CKRL, our method achieves comparable performance on noise detection and triple classification, and significantly better performance on link prediction. Compared with TransE and TransH, our method improves link prediction performance on the Hits@1 metric by 22.0% and 20.8%, respectively, on the WN18RR-10% dataset.
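The translation-based baselines mentioned above (TransE, TransH) score a triple by how well the relation vector translates the head embedding onto the tail embedding. A minimal sketch of the TransE scoring idea, with toy hand-picked vectors in place of learned embeddings:

```python
# Minimal sketch of a TransE-style plausibility score for a triple
# (h, r, t): lower score = more plausible, score = || h + r - t ||_1.
# Real systems learn the vectors; these are toy values for illustration.

def transe_score(h, r, t):
    """L1 distance between the translated head (h + r) and the tail t."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

h = [0.1, 0.2]        # head entity embedding
r = [0.3, 0.1]        # relation embedding
t_good = [0.4, 0.3]   # tail that fits h + r (plausible triple)
t_bad = [0.9, -0.5]   # implausible tail

assert transe_score(h, r, t_good) < transe_score(h, r, t_bad)
```

Link prediction then ranks all candidate tails by this score; Hits@1 counts how often the correct tail ranks first.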
Keywords: Knowledge graph reasoning, knowledge graph embedding, noise detection, triple classification, link prediction
DOI: 10.3233/SW-233276
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-16, 2023
Authors: Chari, Shruthi | Seneviratne, Oshani | Ghalwash, Mohamed | Shirai, Sola | Gruen, Daniel M. | Meyer, Pablo | Chakraborty, Prithwish | McGuinness, Deborah L.
Article Type: Research Article
Abstract: In the past decade, trustworthy Artificial Intelligence (AI) has emerged as a focus for the AI community to ensure better adoption of AI models, and explainable AI is a cornerstone in this area. Over the years, the focus has shifted from building transparent AI methods to making recommendations on how to make black-box or opaque machine learning models and their results more understandable to expert and non-expert users. In our previous work, to address the goal of supporting user-centered explanations that make model recommendations more explainable, we developed an Explanation Ontology (EO). The EO is a general-purpose representation that was designed to help system designers connect explanations to their underlying data and knowledge. This paper addresses the apparent need for improved interoperability to support a wider range of use cases. We expand the EO, mainly in the system attributes contributing to explanations, by introducing new classes and properties to support a broader range of state-of-the-art explainer models. We present the expanded ontology model, highlighting the classes and properties that are important to model a larger set of fifteen literature-backed explanation types that are supported within the expanded EO. We build on these explanation type descriptions to show how to utilize the EO model to represent explanations in five use cases spanning the domains of finance, food, and healthcare. We include competency questions that evaluate the EO's capabilities and provide guidance for system designers on how to apply our ontology to their own use cases. This guidance includes allowing system designers to query the EO directly and providing exemplar queries to explore content in the EO-represented use cases.
We have released this significantly expanded version of the Explanation Ontology at https://purl.org/heals/eo and updated our resource website, https://tetherless-world.github.io/explanation-ontology, with supporting documentation. Overall, through the EO model, we aim to help system designers be better informed about explanations and to support explanations that can be composed, given their systems' outputs from various AI models, including a mix of machine learning, logical and explainer models, and the different types of data and knowledge available to their systems.
Keywords: Explainable AI, semantic representation of explanations, Explanation Ontology, modeling explanation types – AI method outputs and knowledge, supporting patterns for explanation types
DOI: 10.3233/SW-233282
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-31, 2023
Authors: Glauer, Martin | Memariani, Adel | Neuhaus, Fabian | Mossakowski, Till | Hastings, Janna
Article Type: Research Article
Abstract: Reference ontologies provide a shared vocabulary and knowledge resource for their domain. Manual construction and annotation enables them to maintain high quality, allowing them to be widely accepted across their community. However, the manual ontology development process does not scale for large domains. We present a new methodology for automatic ontology extension for domains in which the ontology classes have associated graph-structured annotations, and apply it to the ChEBI ontology, a prominent reference ontology for life sciences chemistry. We train Transformer-based deep learning models on the leaf node structures from the ChEBI ontology and the classes to which they belong. The models are then able to automatically classify previously unseen chemical structures, resulting in automated ontology extension. The proposed models achieved overall F1 scores of 0.80 and above, an improvement of at least 6 percentage points over our previous results on the same dataset. In addition, the models are interpretable: we illustrate that visualizing the model's attention weights can help to explain the results by providing insight into how the model made its decisions. We also analyse the performance for molecules that have not been part of the ontology and evaluate the logical correctness of the resulting extension.
Keywords: Ontology extension, ontology learning, chemical ontology, Transformers, automated classification, transfer learning, multi-label classification
DOI: 10.3233/SW-233183
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-22, 2023
Authors: Daga, Enrico | Groth, Paul
Article Type: Research Article
Abstract: Artificial intelligence systems are not simply built on a single dataset or trained model. Instead, they are built from complex data science workflows involving multiple datasets, models, preparation scripts, and algorithms. Given this complexity, in order to understand these AI systems, we need to provide explanations of their functioning at higher levels of abstraction. To tackle this problem, we focus on the extraction and representation of data journeys from these workflows. A data journey is a multi-layered semantic representation of data processing activity linked to data science code and assets. We propose an ontology to capture the essential elements of a data journey and an approach to extract such data journeys. Using a corpus of Python notebooks from Kaggle, we show that we are able to capture high-level semantic data flow that is more compact than using the code structure itself. Furthermore, we show that introducing an intermediate knowledge graph representation outperforms models that rely only on the code itself. Finally, we report on a user survey to reflect on the challenges and opportunities presented by computational data journeys for explainable AI.
Keywords: Data science analysis, XAI, transparency, explainability, data provenance, workflows
DOI: 10.3233/SW-233407
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
Authors: Vega-Gorgojo, Guillermo
Article Type: Research Article
Abstract: LOD4Culture is a web application that exploits Cultural Heritage Linked Open Data for tourism and education purposes. Since target users are not fluent in Semantic Web technologies, the user interface is designed to hide the intricacies of RDF and SPARQL. An interactive map is provided for exploring world-wide Cultural Heritage sites; it can be filtered by type and uses cluster markers to adapt the view to different zoom levels. LOD4Culture also includes a Cultural Heritage entity browser that builds comprehensive visualizations of sites, artists, and artworks. All data exchanges are facilitated through a generator of REST APIs over Linked Open Data that translates API calls into SPARQL queries across multiple sources, including Wikidata and DBpedia. Since March 2022, more than 1.7K users have employed LOD4Culture. The application has been mentioned many times in social media and has been featured in the DBpedia Newsletter, in the list of Wikidata tools for visualizing data, and in the open data applications list of datos.gob.es.
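The API-call-to-SPARQL translation described above can be sketched as a simple query builder. The function name, parameters and property path below are illustrative assumptions, not LOD4Culture's actual API; the query targets Wikidata's `wdt:P31` (instance of) with its standard label service:

```python
# Hypothetical sketch of translating a REST-style API call (type filter,
# result limit) into a SPARQL query over Wikidata. Function and parameter
# names are invented for the example.

def sites_query(site_type, limit=10):
    """Build a SPARQL query for Wikidata items whose 'instance of'
    (wdt:P31) matches the given Q-identifier."""
    return f"""
SELECT ?site ?siteLabel WHERE {{
  ?site wdt:P31 wd:{site_type} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()

# Q839954 is Wikidata's "archaeological site" class.
print(sites_query("Q839954", 5))
```

A front end can then render the bindings on a map without ever exposing SPARQL to the user, which is the point of the generator.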
Keywords: Cultural Heritage, Linked Open Data, data access, REST API, map visualizations, user interfaces
DOI: 10.3233/SW-233358
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-30, 2023
Authors: Steenwinckel, Bram | De Turck, Filip | Ongenae, Femke
Article Type: Research Article
Abstract: Semantic rule mining can be used for deriving both task-agnostic and task-specific information within a Knowledge Graph (KG). Underlying logical inferences that summarise the KG, or fully interpretable binary classifiers predicting future events, are common results of such a rule mining process. The current methods to perform task-agnostic or task-specific semantic rule mining operate, however, on completely different KG representations, making them less suitable for performing both tasks or incorporating each other's optimizations. This also results in the need to master multiple techniques for both exploring and mining rules within KGs, as well as losing time and resources when converting one KG format into another. In this paper, we use INK, a KG representation based on neighbourhood nodes of interest, to mine rules for improved decision support. By selecting one or two sets of nodes of interest, the rule miner created on top of the INK representation will mine either task-agnostic or task-specific rules. In both subfields, the INK miner is competitive with the current state-of-the-art semantic rule miners on 14 different benchmark datasets within multiple domains.
Keywords: Knowledge representation, semantic rule mining
DOI: 10.3233/SW-233495
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-22, 2023
Authors: Lambrix, Patrick | Armiento, Rickard | Li, Huanyu | Hartig, Olaf | Abd Nikooie Pour, Mina | Li, Ying
Article Type: Research Article
Abstract: In the materials design domain, much of the data from materials calculations is stored in different heterogeneous databases with different data and access models. Therefore, accessing and integrating data from different sources is challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired and guided by the OPTIMADE effort, which aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, first, we describe the development and the content of the Materials Design Ontology. Then, we use a topic model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology in a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology.1 1 This paper is an extension of (In The Semantic Web – ISWC 2020 – 19th International Semantic Web Conference, Proceedings, Part II (2020) 212–227, Springer) with results from (In ESWC Workshop on Domain Ontologies for Research Data Management in Industry Commons of Materials and Manufacturing (2021) 1–11) and currently unpublished results regarding an application using the ontology.
Keywords: Ontology, ontology development, data access, data integration, materials science, Materials Design Ontology
DOI: 10.3233/SW-233340
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-35, 2023
Authors: Bushati, Geni | Rasmusen, Sven Carsten | Kurteva, Anelia | Vats, Anurag | Nako, Petraq | Fensel, Anna
Article Type: Research Article
Abstract: The General Data Protection Regulation (GDPR) has imposed strict requirements for data sharing, one of which is informed consent. A common way to request consent online is via cookies. However, users commonly accept online cookies while being unaware of the meaning of the given consent and the implications that follow. Once consent is given, the cookie “disappears”, and one forgets that consent was given in the first place. Retrieving cookies and consent logs becomes challenging, as most information is stored in the specific Internet browser's logs. To make users aware of the data sharing implied by cookie consent and to support transparency and traceability within systems, we present a knowledge graph (KG) based tool for personalised cookie consent information visualisation. The KG is based on the OntoCookie ontology, which models cookies in a machine-readable format and supports data interpretability across domains. Evaluation results confirm that users' comprehension of the data shared through cookies is vague and insufficient. Furthermore, our work has resulted in an increase of 47.5% in users' willingness to be cautious when viewing cookie banners before giving consent. These and other evaluation results confirm that our cookie data visualisation approach and tool help to increase users' awareness of cookies and data sharing.
Keywords: Cookies, consent, GDPR, ontology, knowledge graph, data sharing, comprehension
DOI: 10.3233/SW-233435
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-17, 2023
Authors: Giustozzi, Franco | Saunier, Julien | Zanni-Merk, Cecilia
Article Type: Research Article
Abstract: In Industry 4.0, factory assets and machines are equipped with sensors that collect data for effective condition monitoring. This is a difficult task since it requires the integration and processing of heterogeneous data from different sources, with different temporal resolutions and underlying meanings. Ontologies have emerged as a pertinent method to deal with data integration and to represent manufacturing knowledge in a machine-interpretable way through the construction of semantic models. Ontologies are used to structure knowledge in knowledge bases, which also contain instances and information about these data. Thus, a knowledge base provides a sort of virtual representation of the different elements involved in a manufacturing process. Moreover, the monitoring of industrial processes depends on the dynamic context of their execution. Under these circumstances, the semantic model must provide a way to represent this evolution, in order to capture which situation(s) a resource is in during the execution of its tasks and thereby support decision making. This paper proposes a semantic framework to address the evolution of knowledge bases for condition monitoring in Industry 4.0. To this end, firstly we propose a semantic model (the COInd4 ontology) for the manufacturing domain that represents the resources and processes that are part of a factory, with special emphasis on the context of these resources and processes. Relevant situations that combine sensor observations with domain knowledge are also represented in the model. Secondly, an approach that uses stream reasoning to detect the situations that lead to potential failures is introduced. This approach enriches data collected from sensors with contextual information using the proposed semantic model. The use of stream reasoning facilitates the integration of data from different data sources and different temporal resolutions, as well as the processing of these data in real time.
This makes it possible to derive high-level situations from lower-level context and sensor information. Detecting situations can trigger actions to adapt the process behavior, and in turn, this change in behavior can lead to the generation of new contexts leading to new situations. These situations can have different levels of severity and can be nested in different ways. Dealing with the rich relations among situations requires an efficient approach to organize them. Therefore, we propose a method to build a lattice ordering those situations depending on the constraints they rely on. This lattice represents a road-map of all the situations, normal or abnormal, that can be reached from a given one. This helps in decision support by allowing the identification of the actions that can be taken to correct an abnormality, thus avoiding the interruption of the manufacturing processes. Finally, an industrial application scenario for the proposed approach is described.
Keywords: Semantic technologies, ontology, context modeling, stream reasoning, condition monitoring, Industry 4.0
DOI: 10.3233/SW-233481
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2023
Authors: Donkers, Alex | de Vries, Bauke | Yang, Dujuan
Article Type: Research Article
Abstract: Occupant feedback enables building managers to improve occupants' health, comfort, and satisfaction. However, acquiring continuous occupant feedback and integrating this feedback with other building information is challenging. This paper presents a scalable method to acquire continuous occupant feedback and directly integrate it with other building information. Semantic web technologies were applied to solve data interoperability issues. The Occupant Feedback Ontology was developed to describe feedback semantically. Next to this, a smartwatch app – Mintal – was developed to acquire continuous feedback on indoor environmental quality. The app gathers location, medical information, and answers to short micro-surveys. Mintal applied the Occupant Feedback Ontology to directly integrate the feedback with linked building data. A case study was performed to evaluate this method. A semantic digital twin was created by integrating linked building data, sensor data, and occupant feedback. Results from SPARQL queries gave more insight into an occupant's perceived comfort levels in the Open Flat. The case study shows how integrating feedback with building information allows for more occupant-centric decision support tools. The approach presented in this paper can be used in a wide range of use cases, both within and beyond the architecture, building, and construction domain.
Keywords: Digital twin, Occupant Feedback Ontology, smartwatch, semantic web, linked building data
DOI: 10.3233/SW-223254
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-26, 2022
Authors: De Giorgis, Stefano | Gangemi, Aldo | Gromann, Dagmar
Article Type: Research Article
Abstract: Commonsense knowledge is a broad and challenging area of research which investigates our understanding of the world as well as human assumptions about reality. Deriving directly from the subjective perception of the external world, it is intrinsically intertwined with embodied cognition. Commonsense reasoning is linked to human sense-making, pattern recognition and knowledge framing abilities. This work presents a new resource that formalizes the cognitive theory of image schemas. Image schemas are dynamic conceptual building blocks originating from our sensorimotor interactions with the physical world, and they enable our sense-making cognitive activity to assign coherence and structure to the entities, events and situations we experience every day. ImageSchemaNet is an ontology that aligns pre-existing resources, such as FrameNet, VerbNet, WordNet and MetaNet from the Framester hub, to image schema theory. This article describes an empirical application of ImageSchemaNet, combined with semantic parsers, to the task of annotating natural language sentences with image schemas.
Keywords: Image schemas, cognitive semantics, frame semantics, commonsense reasoning
DOI: 10.3233/SW-223084
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2022
Authors: Dooley, Damion | Weber, Magalie | Ibanescu, Liliana | Lange, Matthew | Chan, Lauren | Soldatova, Larisa | Yang, Chen | Warren, Robert | Shimizu, Cogan | McGinty, Hande K. | Hsiao, William
Article Type: Research Article
Abstract: People often value the sensual, celebratory, and health aspects of food, but behind this experience exist many other value-laden agricultural production, distribution, manufacturing, and physiological processes that support or undermine a healthy population and a sustainable future. The complexity of such processes is evident both in everyday food preparation from recipes and in industrial food manufacturing, packaging and storage, each of which depends critically on human or machine agents, chemical or organismal ingredient references, and the explicit instructions and implicit procedures held in formulations or recipes. An integrated ontology landscape does not yet exist to cover all the entities at work in this farm-to-fork journey. It seems necessary to construct such a vision by reusing expert-curated fit-to-purpose ontology subdomains and their relationship, material, and more abstract organization and role entities. The challenge is to make this merger, by analogy, one language, rather than nouns and verbs from a dozen or more dialects which cannot be used directly in statements about some aspect of the farm-to-fork journey without expensive translation or substantial dialect education to understand a particular text or domain of knowledge. This work focuses on the ontology components – object and data properties and annotations – needed to model food processes, or more general process modelling, within the context of the Open Biological and Biomedical Ontology Foundry and congruent ontologies. Ideally these components can be brought together in a general process ontology that can be specialized not only for the food domain but for carrying out other protocols as well. Many operations involved in food identification, preparation, transportation and storage – shaking, boiling, mixing, freezing, labeling, shipping – are actually common to activities ranging from manufacturing and laboratory work to local or home food preparation.
Keywords: Ontology, food processing, recipe, process modelling, OBO Foundry
DOI: 10.3233/SW-223096
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-32, 2022
Authors: Spahiu, Blerina | Palmonari, Matteo | Alva Principe, Renzo Arturo | Rula, Anisa
Article Type: Research Article
Abstract: While there has been a trend in the last decades for publishing large-scale and highly-interconnected Knowledge Graphs (KGs), their users often get overwhelmed by the task of understanding their content as a result of their size and complexity. Data profiling approaches have been proposed to summarize large KGs into concise and meaningful representations, so that they can be better explored, processed, and managed. Profiles based on schema patterns represent each triple in a KG with its schema-level counterpart, thus covering the entire KG with profiles of considerable size. In this paper, we provide empirical evidence that profiles based on schema patterns, if explored with suitable mechanisms, can be useful to help users understand the content of big and complex KGs. ABSTAT provides concise pattern-based profiles and comes with faceted interfaces for profile exploration. Using this tool, we present a user study based on query completion tasks. We demonstrate that users who look at ABSTAT profiles formulate their queries better and faster than users browsing the ontology of the KGs. The latter is a fairly strong baseline, considering that many KGs do not even come with a specific ontology to be explored by the users. To the best of our knowledge, this is the first attempt to investigate the impact of profiling techniques on tasks related to knowledge graph understanding with a user study.
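The core abstraction described above, replacing each triple with its schema-level counterpart, can be sketched in a few lines. The toy graph, type assignments and pattern shape below are illustrative only, and ignore ABSTAT's minimal-type selection and other refinements:

```python
# Toy sketch of pattern-based profiling: each (s, p, o) triple is
# abstracted to a (subject type, predicate, object type) schema pattern,
# and pattern occurrences are counted. Data is invented for the example.

from collections import Counter

types = {  # entity -> asserted type
    ":rome": ":City", ":italy": ":Country", ":tiber": ":River",
}

triples = [
    (":rome", ":capitalOf", ":italy"),
    (":tiber", ":flowsThrough", ":rome"),
    (":rome", ":locatedIn", ":italy"),
]

patterns = Counter((types[s], p, types[o]) for s, p, o in triples)
for pattern, count in patterns.items():
    print(pattern, count)
```

Even this tiny profile shows the idea: three instance-level triples collapse into three schema patterns a user can scan to learn what kinds of statements the KG contains, without reading the data itself.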
Keywords: Data understanding, data profiling, summarization, RDF, knowledge graph
DOI: 10.3233/SW-223181
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2023
Authors: Compagno, Francesco | Borgo, Stefano
Article Type: Research Article
Abstract: In both applied ontology and engineering, functionality is a well-researched topic, since it is through teleological causal reasoning that domain experts build mental models of engineering systems, giving birth to functions. These mental models are important throughout the whole lifecycle of any product, being used from the design phase up to diagnosis activities. Though a vast amount of work to model functions has already been carried out, the literature has not settled on a shared and well-defined approach, due to the variety of concepts involved and the modeling tasks that functional descriptions should satisfy. The work in this paper lays the basis for, and takes some crucial steps towards, a rich ontological description of functions and related concepts, such as behaviour, capability, and capacity. A conceptual analysis of such notions is carried out using the top-level ontology DOLCE as a framework, and the ensuing logical theory is formally described in first-order logic and OWL, showing how ontological concepts can model major aspects of engineering products in applications. In particular, it is shown how functions can be distinguished from the implementation methods used to realize them, how one can differentiate between capabilities and capacities of a product, and how these are related to engineering functions.
Keywords: Ontology, function, behaviour, capability, DOLCE
DOI: 10.3233/SW-223188
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-34, 2023
Authors: Portisch, Jan | Hladik, Michael | Paulheim, Heiko
Article Type: Research Article
Abstract: Ontology matching is an integral part of establishing semantic interoperability. One of the main challenges within the ontology matching operation is semantic heterogeneity, i.e. modeling differences between the two ontologies that are to be integrated. The semantics within most ontologies or schemas are, however, typically incomplete, because they are designed within a certain context which is not explicitly modeled. Therefore, external background knowledge plays a major role in the task of (semi-)automated ontology and schema matching. In this survey, we introduce the reader to the general ontology matching problem. We review the background knowledge sources as well as the approaches applied to make use of external knowledge. Our survey covers all ontology matching systems that have been presented within the years 2004–2021 at a well-known ontology matching competition, together with systematically selected publications in the research field. We present a classification system for external background knowledge, concept linking strategies, as well as background knowledge exploitation approaches. We provide extensive examples and classify all ontology matching systems under review in a resource/strategy matrix obtained by coalescing the two classification systems. Lastly, we outline interesting and yet underexplored research directions for applying external knowledge within the ontology matching process.
Keywords: Ontology matching, schema matching, background knowledge, data integration, semantic integration, knowledge graphs, ontologies
DOI: 10.3233/SW-223085
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-55, 2022
Authors: Nguyen, Phuc | Kertkeidkachorn, Natthawut | Ichise, Ryutaro | Takeda, Hideaki
Article Type: Research Article
Abstract: Semantic annotation of tabular data is the process of matching table elements with knowledge graphs. As a result, the table contents could be interpreted or inferred using knowledge graph concepts, enabling them to be useful in downstream applications such as data analytics and management. Nevertheless, semantic annotation tasks are challenging due to insufficient tabular data descriptions, heterogeneous schema, and vocabulary issues. This paper presents an automatic semantic annotation system for tabular data, called MTab4D, to generate annotations with DBpedia in three annotation tasks: 1) matching table cells to entities, 2) matching columns to entity types, and 3) matching pairs of columns to properties. In particular, we propose an annotation pipeline that combines multiple matching signals from different table elements to address schema heterogeneity, data ambiguity, and noisiness. Additionally, this paper provides insightful analysis and extra resources on benchmarking semantic annotation with knowledge graphs. Experimental results on the original and adapted datasets of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab 2019) show that our system achieves impressive performance on the three annotation tasks. MTab4D’s repository is publicly available at https://github.com/phucty/mtab4dbpedia
Keywords: Table annotation, knowledge graph, DBpedia, semantic table interpretation
DOI: 10.3233/SW-223098
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2022
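The multi-signal matching idea in the MTab4D abstract can be illustrated with a small sketch. The signal names, weights, and candidate entities below are hypothetical and are not taken from MTab4D's actual pipeline; the point is only how several weak signals over a table cell can be combined into one score per candidate entity:

```python
def annotate_cell(cell, candidates, signals, weights):
    """Pick the candidate entity with the highest weighted sum of signals."""
    def score(entity):
        return sum(weights[name] * fn(cell, entity)
                   for name, fn in signals.items())
    return max(candidates, key=score)

# Hypothetical signals: exact-label match and a crude token-overlap ratio.
signals = {
    "label": lambda cell, e: 1.0 if cell.lower() == e["label"].lower() else 0.0,
    "tokens": lambda cell, e: len(set(cell.lower().split())
                                  & set(e["label"].lower().split()))
                              / max(len(cell.split()), 1),
}
weights = {"label": 0.7, "tokens": 0.3}
candidates = [{"uri": "dbr:Paris", "label": "Paris"},
              {"uri": "dbr:Paris_Texas", "label": "Paris Texas"}]
print(annotate_cell("Paris", candidates, signals, weights)["uri"])
```

In a real pipeline, further signals such as type agreement with the column and property agreement with neighbouring cells would enter the same weighted sum.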
Authors: Hamilton, Kyle | Nayak, Aparna | Božić, Bojan | Longo, Luca
Article Type: Research Article
Abstract: Advocates for Neuro-Symbolic Artificial Intelligence (NeSy) assert that combining deep learning with symbolic reasoning will lead to stronger AI than either paradigm on its own. As successful as deep learning has been, it is generally accepted that even our best deep learning systems are not very good at abstract reasoning. And since reasoning is inextricably linked to language, it makes intuitive sense that Natural Language Processing (NLP) would be a particularly well-suited candidate for NeSy. We conduct a structured review of studies implementing NeSy for NLP, with the aim of answering the question of whether NeSy is indeed meeting its promises: reasoning, out-of-distribution generalization, interpretability, learning and reasoning from small data, and transferability to new domains. We examine the impact of knowledge representation, such as rules and semantic networks, language structure and relational structure, and whether implicit or explicit reasoning contributes to higher promise scores. We find that systems where logic is compiled into the neural network lead to the most NeSy goals being satisfied, while other factors such as knowledge representation or type of neural architecture do not exhibit a clear correlation with goals being met. We find many discrepancies in how reasoning is defined, specifically in relation to human-level reasoning, which impact decisions about model architectures and drive conclusions which are not always consistent across studies. Hence we advocate for a more methodical approach to the application of theories of human reasoning as well as the development of appropriate benchmarks, which we hope can lead to a better understanding of progress in the field. We make our data and code available for further analysis at https://github.com/kyleiwaniec/neuro-symbolic-ai-systematic-review
Keywords: Neuro-symbolic artificial intelligence, natural language processing, deep learning, knowledge representation & reasoning, structured review
DOI: 10.3233/SW-223228
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-42, 2022
Authors: Pandit, Harshvardhan J. | Esteves, Beatriz
Article Type: Research Article
Abstract: The Global Alliance for Genomics and Health is an international consortium that is developing the Data Use Ontology (DUO) as a standard providing machine-readable codes for automation in data discovery and responsible sharing of genomics data. DUO concepts, which are encoded using OWL, only contain the textual descriptions of the conditions for data use they represent, and do not specify the intended permissions, prohibitions, and obligations explicitly – which limits their usefulness. We present an exploration of how the Open Digital Rights Language (ODRL) can be used to explicitly represent the information inherent in DUO concepts to create policies that are then used to represent conditions under which datasets are available for use, conditions in requests to use them, and to generate agreements based on a compatibility matching between the two. We also address a current limitation of DUO regarding specifying information relevant to privacy and data protection law by using the Data Privacy Vocabulary (DPV) which supports expressing legal concepts in a jurisdiction-agnostic manner as well as for specific laws like the GDPR. Our work supports the existing socio-technical governance processes involving use of DUO by providing a complementary rather than replacement approach. To support this and improve DUO, we provide a description of how our system can be deployed with a proof of concept demonstration that uses ODRL rules for all DUO concepts, and uses them to generate agreements through matching of requests to data offers. All resources described in this article are available at: https://w3id.org/duodrl/repo
Keywords: Health data, biomedical ontologies, policy, regulatory compliance, GDPR
DOI: 10.3233/SW-243583
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-26, 2024
Authors: Ferrada, Sebastián | Bustos, Benjamin | Hogan, Aidan
Article Type: Research Article
Abstract: The SPARQL standard provides operators to retrieve exact matches on data, such as graph patterns, filters and grouping. This work proposes and evaluates two new algebraic operators for SPARQL 1.1 that return similarity-based results instead of exact results. First, a similarity join operator is presented, which brings together similar mappings from two sets of solution mappings. Second, a clustering solution modifier is introduced, which instead of grouping solution mappings according to exact values, brings them together by using similarity criteria. For both cases, a variety of algorithms are proposed and analysed, and use-case queries that showcase the relevance and usefulness of the novel operators are presented. For similarity joins, experimental results are provided by comparing different physical operators over a set of real world queries, as well as comparing our implementation to the closest work found in the literature, DBSimJoin, a PostgreSQL extension that supports similarity joins. For clustering, synthetic queries are designed in order to measure the performance of the different algorithms implemented.
Keywords: Similarity joins, clustering, SPARQL
DOI: 10.3233/SW-243540
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-32, 2024
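The idea of a similarity join over solution mappings, as described in the abstract above, can be sketched with a naive nested loop. This is not the authors' implementation (the paper evaluates specialized physical operators); the variable names, the distance function, and the threshold are illustrative:

```python
def similarity_join(left, right, var_l, var_r, max_dist):
    """Merge pairs of SPARQL-style solution mappings (dicts) whose bound
    numeric values on var_l/var_r differ by at most max_dist, i.e. join
    on similarity instead of exact equality. O(|left| * |right|)."""
    out = []
    for ml in left:
        for mr in right:
            if abs(ml[var_l] - mr[var_r]) <= max_dist:
                out.append({**ml, **mr})  # merged solution mapping
    return out

runners = [{"athlete": "a1", "time": 9.8},
           {"athlete": "a2", "time": 10.4}]
refs = [{"record": "r1", "mark": 9.9}]
print(similarity_join(runners, refs, "time", "mark", 0.2))
```

An exact join on `time = mark` here would return nothing; the similarity join pairs the 9.8 s run with the 9.9 s record because the values fall within the 0.2 threshold.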
Authors: Santos, Veronica | Schwabe, Daniel | Lifschitz, Sérgio
Article Type: Research Article
Abstract: In order to use a value retrieved from a Knowledge Graph (KG) for some computation, the user should, in principle, ensure that s/he trusts the veracity of the claim, i.e., considers the statement as a fact. Crowd-sourced KGs, or KGs constructed by integrating several different information sources of varying quality, must be used via a trust layer. The veracity of each claim in the underlying KG should be evaluated, considering what is relevant to carrying out some action that motivates the information seeking. The present work aims to assess how well Wikidata (WD) supports the trust decision process implied when using its data. WD provides several mechanisms that can support this trust decision, and our KG Profiling, based on WD claims and schema, elaborates an analysis of how multiple points of view, controversies, and potentially incomplete or incongruent content are presented and represented.
Keywords: Trust, contextual, KG Profiling
DOI: 10.3233/SW-243577
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-22, 2024
Authors: Bellucci, Matthieu | Delestre, Nicolas | Malandain, Nicolas | Zanni-Merk, Cecilia
Article Type: Research Article
Abstract: Debugging and repairing Web Ontology Language (OWL) ontologies has been a key field of research since OWL became a W3C recommendation. One way to understand errors and fix them is through explanations. These explanations are usually extracted from the reasoner and displayed to the ontology authors as is. In the meantime, there has been a recent call in the eXplainable AI (XAI) field to use expert knowledge in the form of knowledge graphs and ontologies. In this paper, a parallel between explanations for machine learning and for ontologies is drawn. This link enables the adaptation of XAI methods to explain ontologies and their entailments. Counterfactual explanations have been identified as a good candidate to solve the explainability problem in machine learning. The CEO (Counterfactual Explanations for Ontologies) method is thus proposed to explain inconsistent ontologies using counterfactual explanations. A preliminary user study is conducted to ensure that using XAI methods for ontologies is relevant and worth pursuing.
Keywords: Counterfactual explanations, explainability, ontology, knowledge graph, artificial intelligence
DOI: 10.3233/SW-243566
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-26, 2024
Authors: Dadalto, Atílio A. | Almeida, João Paulo A. | Fonseca, Claudenir M. | Guizzardi, Giancarlo
Article Type: Research Article
Abstract: The distinction between types and individuals is key to most conceptual modeling techniques and knowledge representation languages. Despite that, there are a number of situations in which modelers navigate this distinction inadequately, leading to problematic models. We show evidence of a large number of representation mistakes associated with the failure to employ this distinction in the Wikidata knowledge graph, which can be identified with the incorrect use of instantiation, which is a relation between an instance and a type, and specialization (or subtyping), which is a relation between two types. The prevalence of the problems in Wikidata’s taxonomies suggests that methodological and computational tools are required to mitigate the issues identified, which occur in many settings when individuals, types, and their metatypes are included in the domain of interest. We conduct a conceptual analysis of entities involved in recurrent erroneous cases identified in this empirical data, and present a tool that supports users in identifying some of these mistakes.
Keywords: Wikidata, multi-level taxonomies, quality assessment
DOI: 10.3233/SW-243562
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-18, 2024
Authors: Troullinou, Georgia | Agathangelos, Giannis | Kondylakis, Haridimos | Stefanidis, Kostas | Plexousakis, Dimitris
Article Type: Research Article
Abstract: The explosion of the web and the abundance of linked data demand effective and efficient methods for storage, management, and querying. Apache Spark is one of the most widely used engines for big data processing, with more and more systems adopting it for efficient query answering. Existing approaches that exploit Spark for querying RDF data adopt partitioning techniques to reduce the data that need to be accessed in order to improve efficiency. However, simplistic data partitioning fails, on the one hand, to minimize data access and, on the other hand, to group data usually queried together. This translates into limited improvement in terms of efficiency in query answering. In this paper, we present DIAERESIS, a novel platform that accepts as input an RDF dataset and effectively partitions it, minimizing data access and improving query answering efficiency. To achieve this, DIAERESIS first identifies the top-k most important schema nodes, i.e., the most important classes, as centroids and distributes the other schema nodes to the centroid they mostly depend on. Then, it allocates the corresponding instance nodes to the schema nodes they are instantiated under. Our algorithm enables fine-tuning of data distribution, significantly reducing data access for query answering. We experimentally evaluate our approach using both synthetic and real workloads, strictly dominating the existing state of the art, and show that we improve query answering in several cases by orders of magnitude.
Keywords: RDF, data partitioning, Spark, query answering
DOI: 10.3233/SW-243554
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-27, 2024
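The centroid-based partitioning step from the DIAERESIS abstract can be sketched as follows. The node names and dependence scores are made up; the real system derives importance and dependence from the actual RDF schema graph:

```python
def partition(schema_nodes, dependence, centroids):
    """Assign each non-centroid schema node to the centroid it most
    depends on. dependence maps (node, centroid) pairs to a score;
    missing pairs default to 0.0."""
    parts = {c: [c] for c in centroids}
    for n in schema_nodes:
        if n in centroids:
            continue
        best = max(centroids, key=lambda c: dependence.get((n, c), 0.0))
        parts[best].append(n)
    return parts

centroids = ["Person", "Place"]          # hypothetical top-k important classes
nodes = ["Athlete", "City", "Person", "Place"]
dep = {("Athlete", "Person"): 0.9, ("City", "Place"): 0.8}
print(partition(nodes, dep, centroids))
```

Instance nodes would then be allocated to the partition of the schema node they are instantiated under, so that data usually queried together lands in the same partition.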
Authors: Hyvönen, Eero
Article Type: Research Article
Abstract: This paper presents a model and lessons learned for creating a cross-domain national ontology and Linked (Open) Data (LOD) infrastructure. The idea is to extend the global, domain-agnostic “layer cake model” underlying the Semantic Web with domain-specific and local features needed in applications. To test and demonstrate the infrastructure, a series of LOD services and portals in use have been created in 2002–2023 that cover a wide range of application domains. They have attracted millions of users in total, suggesting the feasibility of the proposed model. This line of research and development is unique due to its systematic national-level nature and long time span of over twenty years.
Keywords: Semantic Web, Linked Data, ontologies, web services, infrastructures, portals, Digital Humanities
DOI: 10.3233/SW-243468
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-15, 2024
Authors: Confalonieri, Roberto | Kutz, Oliver | Calvanese, Diego | Alonso-Moral, Jose Maria | Zhou, Shang-Ming
Article Type: Editorial
Keywords: Explainable AI, symbolic knowledge, applied ontology
DOI: 10.3233/SW-243529
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-4, 2024
Authors: Bella, Giampaolo | Cantone, Domenico | Castiglione, Gianpietro | Nicolosi Asmundo, Marianna | Santamaria, Daniele Francesco
Article Type: Research Article
Abstract: Electronic commerce and finance are progressively supporting and including decentralized, shared and public ledgers such as the blockchain. This is reshaping traditional commercial activities by advancing them towards Decentralized Finance (DeFi) and Commerce 3.0, thereby supporting the latter’s potential to outpace the hurdles of central authority controllers and lawgivers. The quantity and entropy of the information that must be sought and managed to become active participants in such a relentlessly evolving scenario are increasing at a steady pace. For example, that information comprises asset or service description, general rules of the game, and specific technologies involved for decentralization. Moreover, the relevant information ought to be shared among innumerable and heterogeneous stakeholders, such as producers, buyers, digital identity providers, valuation services, and shipment services, to name just a few. A clear semantic representation of such a complex and multifaceted blockchain-based e-Commerce ecosystem would contribute dramatically to making it more usable, namely more automatically accessible to virtually anyone wanting to play the role of a stakeholder, thereby reducing programmers’ effort. However, we feel that reaching that goal still requires substantial effort in the tailoring of Semantic Web technologies, hence this article sets out on such a route and advances a stack of OWL 2 ontologies for the semantic description of decentralized e-commerce. The stack includes a number of relevant features, ranging from the applicable stakeholders through the supply chain of the offerings for an asset, up to the Ethereum blockchain, its tokens and smart contracts. Ontologies are defined by taking a behaviouristic approach to represent the various participants as agents in terms of their actions, inspired by the Theory of Agents and the related mentalistic notions.
The stack is validated through appropriate metrics and SPARQL queries implementing suitable competency questions, then demonstrated through the representation of a real-world use case, namely, the iExec marketplace.
Keywords: Ontology, OWL, Semantic Web, DeFi, agent, blockchain, Ethereum, e-commerce, supply chain, ONTOCHAIN, iExec
DOI: 10.3233/SW-243543
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-52, 2024
Authors: Flügel, Simon | Glauer, Martin | Neuhaus, Fabian | Hastings, Janna
Article Type: Research Article
Abstract: In ontology development, there is a gap between domain ontologies which mostly use the Web Ontology Language, OWL, and foundational ontologies written in first-order logic, FOL. To bridge this gap, we present Gavel, a tool that supports the development of heterogeneous ‘FOWL’ ontologies that extend OWL with FOL annotations, and is able to reason over the combined set of axioms. Since FOL annotations are stored in OWL annotations, FOWL ontologies remain compatible with the existing OWL infrastructure. We show that for the OWL domain ontology OBI, the stronger integration with its FOL top-level ontology BFO via our approach enables us to detect several inconsistencies. Furthermore, existing OWL ontologies can benefit from FOL annotations. We illustrate this with FOWL ontologies containing mereotopological axioms that enable additional, useful inferences. Finally, we show that even for large domain ontologies such as ChEBI, automatic reasoning with FOL annotations can be used to detect previously unnoticed errors in the classification.
Keywords: Ontology, heterogeneous ontology, first-order
DOI: 10.3233/SW-243440
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-16, 2024
Authors: Křemen, Petr | Med, Michal | Blaško, Miroslav | Saeeda, Lama | Ledvinka, Martin | Buzek, Alan
Article Type: Research Article
Abstract: Thesauri are popular, as they represent a manageable compromise – they are well understood by domain experts, yet formal enough to boost use cases like semantic search. Still, as thesauri grow in size and complexity within a domain, properly tracking concept references to their definitions in normative documents, interlinking concepts defined in different documents, and keeping all the concepts semantically consistent and ready for subsequent conceptual modeling is difficult and requires adequate tool support. We present TermIt, a web-based thesauri manager aimed at supporting the creation of thesauri based on decrees, directives, standards, and other normative documents. In addition to common editing capabilities, TermIt offers term extraction from documents, including a web document annotation browser plug-in, tracking of term definitions in documents, term quality and ontological correctness checking, community discussions over term meanings, and seamless interlinking of concepts across different thesauri. We also show that TermIt’s features fit the e-government scenarios in the Czech Republic better than other tools, and we demonstrate TermIt’s feasibility for these scenarios through a preliminary user experience evaluation.
Keywords: Thesaurus, ontology, SKOS, UFO
DOI: 10.3233/SW-243547
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-11, 2024
Authors: Vámos, Csilla | Scheider, Simon | Sonnenschein, Tabea | Vermeulen, Roel
Article Type: Research Article
Abstract: Exposure is a central concept of the health and behavioural sciences needed to study the influence of the environment on the health and behaviour of people within a spatial context. While an increasing number of studies measure different forms of exposure, including the influence of air quality, noise, and crime, the influence of land cover on physical activity, or of the urban environment on food intake, we lack a common conceptual model of environmental exposure that captures its main structure across all this variety. Against the background of such a model, it becomes possible not only to systematically compare different methodological approaches but also to better link and align the content of the vast amount of scientific publications on this topic in a systematic way. For example, an important methodical distinction is between studies that model exposure as an exclusive outcome of some activity versus ones where the environment acts as a direct independent cause (active vs. passive exposure). Here, we propose an information ontology design pattern that can be used to define exposure and to model its variants. It is built around causal relations between concepts including persons, activities, concentrations, exposures, environments and health risks. We formally define environmental stressors and variants of exposure using Description Logic (DL), which allows automatic inference from the RDF-encoded content of a paper. Furthermore, concepts can be linked with data models and modelling methods used in a study. To test the pattern, we translated competency questions into SPARQL queries and ran them over RDF-encoded content. Results show how study characteristics can be classified and summarized in a manner that reflects important methodical differences.
Keywords: Ontology, epidemiology, RDF, GIS, computer science
DOI: 10.3233/SW-243546
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-29, 2024
Authors: Khan, M. Jaleed | G. Breslin, John | Curry, Edward
Article Type: Research Article
Abstract: Exploring the potential of neuro-symbolic hybrid approaches offers promising avenues for seamless high-level understanding and reasoning about visual scenes. Scene Graph Generation (SGG) is a symbolic image representation approach based on deep neural networks (DNN) that involves predicting objects, their attributes, and pairwise visual relationships in images to create scene graphs, which are utilized in downstream visual reasoning. The crowdsourced training datasets used in SGG are highly imbalanced, which results in biased SGG results. The vast number of possible triplets makes it challenging to collect sufficient training samples for every visual concept or relationship. To address these challenges, we propose augmenting the typical data-driven SGG approach with common sense knowledge to enhance the expressiveness and autonomy of visual understanding and reasoning. We present a loosely-coupled neuro-symbolic visual understanding and reasoning framework that employs a DNN-based pipeline for object detection and multi-modal pairwise relationship prediction for scene graph generation and leverages common sense knowledge in heterogeneous knowledge graphs to enrich scene graphs for improved downstream reasoning. A comprehensive evaluation is performed on multiple standard datasets, including Visual Genome and Microsoft COCO, in which the proposed approach outperformed the state-of-the-art SGG methods in terms of relationship recall scores, i.e. Recall@K and mean Recall@K, as well as the state-of-the-art scene graph-based image captioning methods in terms of SPICE and CIDEr scores with comparable BLEU, ROUGE and METEOR scores. As a result of enrichment, the qualitative results showed improved expressiveness of scene graphs, resulting in more intuitive and meaningful caption generation using scene graphs. Our results validate the effectiveness of enriching scene graphs with common sense knowledge using heterogeneous knowledge graphs.
This work provides a baseline for future research in knowledge-enhanced visual understanding and reasoning. The source code is available at https://github.com/jaleedkhan/neusire
Keywords: Scene graph, image representation, common sense knowledge, knowledge enrichment, visual reasoning, image captioning
DOI: 10.3233/SW-233510
Citation: Semantic Web, vol. Pre-press, no. Pre-press, pp. 1-25, 2023
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
sales@iospress.com
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
info@iospress.nl
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office info@iospress.nl
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
china@iospress.cn
If you need help with publishing or have any suggestions, please email: editorial@iospress.nl