Journal of Intelligent & Fuzzy Systems - Volume 36, issue 5 - Journals

Acoustic resonance spectroscopy based simple system for spectral characterization and classification of materials

Authors: Khan, Munna | Reza, Md Qaiser | Salhan, Ashok Kumar | Sirdeshmukh, Shaila P.S.M.A.

Article Type: Research Article

Abstract: Acoustic resonance spectroscopy is an accurate, precise, inexpensive, and non-destructive method for the identification and quantification of materials. The acoustics-based inspection methods used for classifying materials in food, security, and healthcare are constrained by expensive instrumentation, complicated transducer coupling, etc. Hence, a simple, inexpensive, and portable system has been devised that acquires data quickly and classifies the materials. It has two piezoelectric transducers glued to the two ends of a V-shaped quartz tube, one acting as a transmitter and the other as a receiver. The transmitter generates vibration by white noise excitation. The receiver detects the resultant signal after interaction with the samples, and the acoustic signal is recorded with the help of a laptop and software. From analysis of the power spectrum of the signals acquired from each of the samples, seven resonant peaks were obtained. PCA was carried out, selecting only two principal components as feature vectors for classification. The overall accuracies of the LDA and Naive Bayes classifiers were 98.91% and 96.83%, respectively. The classification accuracy of LDA for distilled water, sugar solution, and salt solution was found to be 100%, 98.5%, and 98.25%, respectively, while the accuracy of the Naive Bayes classifier was 94%, 98.5%, and 98%, respectively. The results show that the classification accuracy of LDA is better than that of the Naive Bayes classifier. The datasets of the developed system show a significant capability in the classification of materials.

Keywords: Acoustic resonance spectroscopy (ARS), acoustic signature, principal component analysis (PCA), linear discriminant analysis (LDA)

DOI: 10.3233/JIFS-169994

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4389-4397, 2019

Price: EUR 27.50

Smartphone based context-aware driver behavior classification using dynamic bayesian network

Authors: Chhabra, Rishu | Krishna, C. Rama | Verma, Seema

Article Type: Research Article

Abstract: Intelligent Transportation Systems (ITS) aim at reducing the risks associated with the transportation system, as road accidents are becoming one of the primary causes of death in developing countries. Monitoring of driver behavior is one of the key areas of ITS and assists in vehicle safety systems. It has gained importance in order to reduce traffic accidents and ensure the safety of all road users, from drivers to pedestrians. In this work, we present a context-aware system that considers the vehicle, the driver and the environment to classify driver behavior as safe, fatigue or unsafe (the last representing any other unsafe driving behavior, such as a drunk or reckless driver) using a Dynamic Bayesian Network (DBN). We have designed a questionnaire to obtain the influencing factors that decide safe, unsafe and fatigue driving behavior. The collected data has been analyzed using the Statistical Package for Social Sciences (SPSS). Several techniques have been proposed in the past for driver behavior classification or detection; these use specialized sensors or hardware devices, inbuilt smartphone sensors (such as a gyroscope, accelerometer, magnetometer and GPS), or complex sensor fusion algorithms. The novelty of our work lies in designing and developing a context-aware system based on an Android smartphone that considers the complete driving context (driver, vehicle and surrounding environment) and classifies the driver behavior using a DBN. In order to identify driver fatigue, results from the designed questionnaire and previous research studies have been used, without the need for special hardware devices. A DBN that combines all the contextual information has been created using GeNIe Modeler. Learning of the DBN has been carried out using the Expectation–Maximization (EM) algorithm. The real-time data for DBN learning and testing has been collected on the Chandigarh-Patiala National Highway, India, using an Android smartphone. The proposed system yields an overall classification accuracy of 80–83%. The focus of this paper is to develop a cost-effective context-aware driver behavior classification system to promote ITS in developing countries.

Keywords: DBN, driving behavior, intelligent transportation systems, sensors, smartphone

DOI: 10.3233/JIFS-169995

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4399-4412, 2019

Price: EUR 27.50

Intelligent navigation of multiple coordinated robots

Authors: Pradhan, Buddhadeb | Vijayakumar, V. | Hui, Nirmal Baran | Sinha Roy, Diptendu

Article Type: Research Article

Abstract: Navigation of multiple robots is a challenging task, particularly for many robots, since individual gains may more often than not adversely affect the global gain. This paper investigates the problem of multiple robots moving towards individual goals within a common workspace without colliding with one another. Two solutions for coordination, namely a Fuzzy Logic Controller (FLC) and a Genetic Algorithm based FLC (GA-FLC), have been employed, and the efficacy of the cooperation strategies has been compared with their non-cooperative counterparts as well as with the fundamental potential field method (PFM). The proposed coordination schemes are verified through simulations. A total of 100 scenarios are considered, varying the number of robots (8, 12, 16 and 20). The obtained results show the efficacy of the proposed schemes.

Keywords: Multi-agent systems, motion planning, coordination

DOI: 10.3233/JIFS-169996

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4413-4423, 2019

Price: EUR 27.50

Real time FPGA-ANN architecture for outdoor obstacle detection focused in road safety

Article Type: Research Article

Abstract: Object detection is a technologically challenging problem that is useful for safety in outdoor environments, where the detected object frequently represents an obstacle that must be avoided. Although several object detection methods have been developed in recent years, they tend to produce poor results in outdoor environments, being mainly affected by sunlight, light intensity, shadows, and limited computational resources. This open problem is the main motivation for exploring low-cost object detection solutions that are easily adaptable and have low power requirements, such as those needed in on-board obstacle detection systems in automobiles. In this work, we present a trade-off analysis of several architectures using an FPGA-based design that implements ANNs (FPGA-ANN) for outdoor obstacle detection, focused on road safety. The analyzed FPGA-ANN architectures merge outdoor data gathered by a Kinect sensor (images and infrared data) to construct an outdoor environment model for object detection, which makes it possible to detect whether there is an obstacle in the near surroundings of a vehicle.

Keywords: Obstacle detection, artificial neural networks, FPGA implementation, architecture trade-off analysis, road safety

DOI: 10.3233/JIFS-169997

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4425-4436, 2019

Price: EUR 27.50

Adaptive firefly algorithm based optimized key generation for image security

Authors: Sinha, Rupesh Kumar | Sahu, S.S.

Article Type: Research Article

Abstract: Cryptography is the most peculiar way to secure data, but most encryption algorithms are designed for textual data and are not suitable for transmitted data such as images. Generating a secure key has been a challenging task in image cryptography. To aid secure key generation in this context, an optimized secret key generation method based on the Chebyshev polynomial with the Adaptive Firefly (FF) optimization technique is proposed. The optimized key is used with shuffling, diffusion, and swapping to obtain a better encrypted image. At the receiver end, the reverse process is applied with the optimized key to retrieve the original input image. The efficiency of the proposed method is assessed by an exhaustive experimental study. The results, compared with existing methods, show that the proposed methodology achieves a correlation coefficient of 0.21, a Number of Pixels Change Rate (NPCR) of 0.996, a Unified Average Changing Intensity (UACI) of 0.3346 and an information entropy of 7.995.

Keywords: Encrypted image, DWT, Chebyshev polynomial, optimized secret key, Adaptive firefly (FF) optimization algorithm

DOI: 10.3233/JIFS-169998

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4437-4447, 2019

Price: EUR 27.50
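
The NPCR, UACI and information entropy figures quoted in the abstract above are commonly computed from standard definitions. The following is a minimal sketch of those textbook formulas, not the authors' evaluation code: it assumes 8-bit grayscale images stored as nested lists, returns NPCR and UACI as fractions rather than percentages, and uses hypothetical function names (`npcr`, `uaci`, `entropy`).

```python
import math
from collections import Counter


def npcr(img_a, img_b):
    """Fraction of pixel positions whose values differ between two images."""
    pairs = [(a, b) for row_a, row_b in zip(img_a, img_b)
             for a, b in zip(row_a, row_b)]
    return sum(1 for a, b in pairs if a != b) / len(pairs)


def uaci(img_a, img_b, max_level=255):
    """Mean absolute intensity difference, normalised by the maximum level."""
    pairs = [(a, b) for row_a, row_b in zip(img_a, img_b)
             for a, b in zip(row_a, row_b)]
    return sum(abs(a - b) for a, b in pairs) / (max_level * len(pairs))


def entropy(img):
    """Shannon entropy of the pixel histogram, in bits per pixel."""
    pixels = [p for row in img for p in row]
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    # Two tiny illustrative "cipher images" (made-up values).
    c1 = [[0, 255], [10, 10]]
    c2 = [[255, 255], [10, 20]]
    print(npcr(c1, c2), uaci(c1, c2), entropy(c1))
```

In practice the two inputs are usually cipher images obtained from plain images that differ in a single pixel; for an 8-bit image the entropy is bounded above by 8 bits per pixel.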

HMAC-RSA: A security mechanism in cognitive radio for enhancing the security in a radio cognitive system

Authors: Srinivasan, Sundar | ShivaKumar, K.B. | Muazzam, Mohammad

Article Type: Research Article

Abstract: A cognitive radio (CR) can be programmed and configured dynamically to use the best wireless channels. Such a radio automatically detects available channels in the wireless spectrum and changes its transmission accordingly. The CR system consists of a primary (licensed) user and a secondary (unlicensed) user. Security attacks, both active and passive, are identified between the primary user and the secondary user, and packet loss occurs during packet transmission. The security problem that arises during signal transmission between the primary user and the secondary user is addressed by using hybrid RSA (Rivest, Shamir and Adleman) and HMAC (Hash-based Message Authentication Code) algorithms, where the former is used for key generation and the latter for generating a tag that is sent along with the signal. Additionally, the packet loss incurred in the system during transmission is reduced with the aid of a Markov Chain Model. The comparison results show the efficiency of the proposed algorithm in a cognitive radio system in terms of throughput, encryption time, decryption time, Packet Delivery Ratio and energy consumption.

Keywords: Cognitive radio, RSA (Rivest, Shamir and Adleman), HMAC (Hash-based Message Authentication Code), Markov Chain Model, active attack, passive attack

DOI: 10.3233/JIFS-169999

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4449-4459, 2019

Price: EUR 27.50
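
The tag generation step mentioned in the abstract above can be illustrated with Python's standard `hmac` module. This is a minimal sketch under stated assumptions, not the paper's scheme: the shared key is a fixed demo value (the paper derives keys with RSA, which is not reproduced here), SHA-256 is an assumed hash choice, and `make_tag` and `verify_tag` are hypothetical helper names.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag to send alongside the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


if __name__ == "__main__":
    key = b"demo-shared-key"          # assumption: key already agreed by both users
    packet = b"spectrum report #1"    # made-up payload
    tag = make_tag(key, packet)
    print(verify_tag(key, packet, tag))              # genuine packet
    print(verify_tag(key, b"tampered packet", tag))  # modified packet
```

The receiver accepts a packet only when the recomputed tag matches the received one, which detects modification in transit by anyone who does not hold the key.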

Evolutionary algorithm based control strategy for enhanced operation of multifunction grid connected converters

Authors: Vijayakumar, K. | Rajesh, K. | Vishnuvardhanan, G. | Kannan, S.

Article Type: Research Article

Abstract: Distributed Generation (DG) systems have recently become highly useful for increasing the penetration of renewable energy, and the design of grid connected inverters for them is a demanding and challenging task. For this reason, different controller strategies have been developed in earlier works for controlling the inverters with increased efficiency. However, they have the major limitations of increased computational complexity, steady state error and reduced compensation capability. To solve these issues, this research work aims to design a new controller by implementing a novel Monkey King Evolution Algorithm (MKEA) for grid connected converters. The motive of this work is to increase the overall effectiveness of the power system by controlling the inverter without affecting its output. It also aims to provide a secure and convenient controller for the power converters. Here, the information obtained from the system, which includes the real power, the distorted power due to the load, the reactive power of the load, and the apparent power of the inverter, is taken as the input. Four monkeys are then initialized, which evaluate the best solution based on these parameters. The monkey king then obtains the best solutions from the monkeys and uses them to select the most suitable solution for the decision. Based on this, the reference current is generated by performing voltage regulation and the abc to dq0 transformation. During simulation, the efficiency of the controller is analyzed using the measures of phase voltage, phase current, active power, reactive power, apparent power, grid voltage, and output voltage. The Total Harmonic Distortion (THD) is effectively reduced by using the MKEA based controller design. Extensive simulation and experimental results are presented to validate the effectiveness of the proposed controller and control strategy.

Keywords: Grid connected inverters, Distributed Generation System, Monkey King Evolution Algorithm (MKEA), Photovoltaic (PV) System, controller design, reference current generation

DOI: 10.3233/JIFS-179000

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4461-4478, 2019

Price: EUR 27.50

Recommendations with context aware framework using particle swarm optimization and unsupervised learning

Authors: Jain, Parul | Dixit, Veer Sain

Article Type: Research Article

Abstract: Context aware recommender systems have become an area of rigorous research because incorporating context features increases the accuracy of recommendations. Most studies have found neighborhood based collaborative filtering to be one of the most efficient mechanisms in recommender systems because of its simplicity, intuitiveness and wide usage in commercial domains. However, the basic challenges observed in this area include sparsity of data, scalability and effective utilization of contexts. In this study, a novel framework is proposed to generate recommendations independently of the count and type of context dimensions, making it pertinent for real life recommender systems. In the framework, we have used the k-prototype clustering technique to group contextually similar users into a reduced and effective set. Additionally, particle swarm optimization is applied to the closest cluster to find the contribution of different context features in order to control the data sparsity problem. The proposed framework also employs an improved similarity measure which considers the contextual condition of the user. The results of a series of experiments using two context enriched datasets show that the proposed framework increases the accuracy of recommendations over other techniques from the same domain without incurring extra cost in terms of time.

Keywords: Collaborative filtering, unsupervised learning, particle swarm optimization, euclidean distance, context aware recommendations

DOI: 10.3233/JIFS-179001

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4479-4490, 2019

Price: EUR 27.50

Color image share cryptography: a novel approach

Authors: Hasnat, Abul | Barman, Dibyendu | Sarkar, Suchintya

Article Type: Research Article

Abstract: Shared visual cryptography is a method to protect image-based secrets in which an image is kept as multiple shares and decoding requires little computation. Steganography is a technique to hide secret data in a carrier such as audio or an image. Steganography techniques fall into four categories. i) Spatial Domain Technique: image pixel values are converted into binary and some of the binary values are changed to hide secret data. ii) Transform Domain Technique: the message is hidden in a cover image, which is then transformed into the frequency domain. iii) Distortion Technique: information is stored by changing the value of the pixel. iv) Visual Cryptography Technique: the image is broken into two or more parts called shares. This article proposes a hybrid visual crypto-steganography approach which exploits the advantages of both approaches to protect image-based secrets in communication. Most visual cryptography is applied to black and white images, but the proposed method can be applied directly to color images having three channels. The method does not change the image size. Also, an exact replica of the original image can be reconstructed, so the process does not degrade image quality. This article proposes novel color image share cryptography in which seven shares are generated from one color image (correlated/de-correlated color space). These shares are sent to the receiver and the original image is reconstructed using all of them. Share generation and image reconstruction are based on simple operations such as pixel shuffling, reversing the binary string of the image information, and the ratio of pixel intensity values. A row key matrix and a column key matrix are generated using a random function, and pixel positions are shuffled using these two key matrices. The seven shares generated are the Row Key, Column Key, Remainder matrix, Quotient matrix, R ratio matrix, G ratio matrix and B ratio matrix. The Row key matrix, Column key matrix, Remainder matrix, Quotient matrix and three ratio matrices are then hidden in separate cover images by the LSB encoding technique and sent over the network. The receiver can reconstruct the image only if all shares are available. The proposed method is applied to standard images from the literature and to images captured using a standard digital camera. A comparison with existing methods shows that the proposed method performs better in terms of NIST metrics. The method has many applications in visual cryptography, shared cryptography, image-based authentication, etc.

Keywords: Binary image, Cryptography, GCD, image decryption, image encryption, image security, quotient, remainder, shared visual cryptography, steganography

DOI: 10.3233/JIFS-179002

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4491-4506, 2019

Price: EUR 27.50
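
The abstract above mentions hiding each share in a cover image by LSB encoding. The following is a minimal sketch of generic least-significant-bit embedding, assuming a cover image flattened to a list of 8-bit values and one payload bit per pixel; it does not reproduce the paper's share generation, and all function names are hypothetical.

```python
def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack a list of bits (length a multiple of 8) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def embed_lsb(cover, payload):
    """Write each payload bit into the lowest bit of successive cover pixels."""
    bits = bytes_to_bits(payload)
    if len(bits) > len(cover):
        raise ValueError("cover image too small for payload")
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the payload bit
    return stego


def extract_lsb(stego, n_bytes):
    """Read back n_bytes of payload from the lowest bits of the stego pixels."""
    return bits_to_bytes([p & 1 for p in stego[:n_bytes * 8]])


if __name__ == "__main__":
    cover = list(range(100, 132))   # 32 made-up pixel values
    secret = b"\x2a\x07\xff"        # 3 bytes standing in for part of a share
    stego = embed_lsb(cover, secret)
    print(extract_lsb(stego, len(secret)) == secret)
```

Because only the lowest bit of each pixel changes, every stego pixel differs from its cover pixel by at most one intensity level.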

An intelligent system to detect human suspicious activity using deep neural networks

Authors: Ramachandran, Sumalatha | Palivela, Lakshmi Harika

Article Type: Research Article

Abstract: The importance of surveillance is increasing every day. Surveillance is the monitoring of activities, behavior and other changing information. An intelligent automatic system to detect human behavior is very important in public places. To meet this need, a framework is proposed to detect suspicious human behavior, to track a person performing an unusual activity such as fighting or threatening actions, and to distinguish normal human activities from suspicious behavior. Human activity is recognized by extracting features with a convolutional neural network (CNN) from the extracted optical flow slices and pre-training on the activities based on real-time activities. The learned features produce a score for each input, which is used to predict the type of activity, and the activity is classified using a multi-class support vector machine (MSVM). This improved design will provide a better surveillance system than existing ones. Such a system can be used in public places such as shopping malls and railway stations, or in closed environments such as ATMs, where security is the prime concern. The performance of the system is evaluated using different standard datasets containing different objects, and it achieved 95% performance, as explained in the experimental analysis.

Keywords: Suspicious activity detection, optical flow, convolutional neural networks, support vector machine, multi-class SVM

DOI: 10.3233/JIFS-179003

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4507-4518, 2019

Price: EUR 27.50

Scalable hybrid and ensemble heuristics for economic virtual resource allocation in cloud and fog cyber-physical systems

Authors: Jangiti, Saikishor | Sri Ram, E. | Ravi, Logesh | Sriram, V.S. Shankar

Article Type: Research Article

Abstract: With the advent of cloud computing as a cost-effective and reliable way to employ IT infrastructure, cyber-physical systems (CPS) are transforming into loosely coupled cloud and fog CPS. The sensor information from physical processes at the CPS is continuously processed by fog computing nodes and forwarded for advanced data analytics offered as a service from the cloud. The computation offloaded by fog devices is initiated as Virtual Machines (VMs) in the cloud data center. The effective placement of these VMs onto a minimum number of Physical Machines (PMs) involves economic and environmental issues. Recent research works indicate the use of First-Fit Decreasing (FFD) based heuristic techniques to address this NP-Hard problem as a vector bin-packing problem. In this research work, we present a set of hybrid heuristics and an ensemble heuristic to improve the solution quality. The simulation results show that the proposed heuristics are highly scalable and economical in comparison with the individual heuristic-based approaches.

Keywords: cyber-physical systems, fog computing, cloud computing, virtual machine placement, first-fit decreasing

DOI: 10.3233/JIFS-179004

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4519-4529, 2019

Price: EUR 27.50
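
The First-Fit Decreasing baseline named in the abstract above can be sketched for the vector bin-packing setting as follows. This is an illustrative sketch of plain FFD only, not the hybrid or ensemble heuristics the paper proposes. It assumes every PM has the same capacity vector, that VM demands are integer tuples such as (CPU cores, memory in GB), and that VMs are ordered by the sum of their capacity-normalised demands, which is one common choice among several; `ffd_place` is a hypothetical name.

```python
def ffd_place(vms, capacity):
    """Place VMs on identical PMs using First-Fit Decreasing.

    vms      -- list of demand tuples, e.g. (cpu_cores, memory_gb)
    capacity -- capacity tuple of every PM, same dimensions as a VM
    Returns a list of PMs, each a list of the VMs placed on it.
    """
    for vm in vms:
        if any(d > c for d, c in zip(vm, capacity)):
            raise ValueError(f"VM {vm} exceeds PM capacity {capacity}")

    def size(vm):
        # Scalar size used for the decreasing order (assumed weighting).
        return sum(d / c for d, c in zip(vm, capacity))

    hosts = []  # each host tracks its used resources and the VMs it holds
    for vm in sorted(vms, key=size, reverse=True):
        for host in hosts:
            if all(u + d <= c for u, d, c in zip(host["used"], vm, capacity)):
                break  # first PM with room in every dimension
        else:
            host = {"used": [0] * len(capacity), "vms": []}
            hosts.append(host)  # no PM fits: open a new one
        host["used"] = [u + d for u, d in zip(host["used"], vm)]
        host["vms"].append(vm)
    return [host["vms"] for host in hosts]


if __name__ == "__main__":
    demo_vms = [(4, 8), (2, 4), (2, 2), (1, 2), (4, 8)]  # made-up demands
    for i, placed in enumerate(ffd_place(demo_vms, capacity=(8, 16))):
        print(f"PM {i}: {placed}")
```

Sorting largest-first tends to leave small VMs to fill the gaps in already opened PMs, which is why FFD usually needs fewer PMs than first-fit on the unsorted input.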

A testbed validated simple time synchronization protocol for clustered wireless sensor networks for IoT

Authors: Chalapathi, G.S.S. | Chamola, Vinay | Gurunarayanan, S.

Article Type: Research Article

Abstract: Wireless Sensor Networks (WSNs) are set to play an important role in the Internet of Things (IoT). WSNs are deployed for many IoT applications such as Smart-Street Lighting and Smart-Grid. The Time Synchronization Protocol (TSP) is an important protocol in WSNs and is used for many of their operations. Most of the existing TSPs for WSNs are simulation-based works, which do not fully prove their effectiveness for WSNs. Further, the Line-of-Sight (LOS) conditions in which the WSN nodes are deployed can significantly affect the performance of these TSPs. However, most of the existing protocols neither describe the LOS conditions in which they were tested nor prove their effectiveness for different LOS conditions. To address these aspects, a synchronization protocol for cluster-based WSNs called a Simple Hierarchical Algorithm for Time Synchronization (H-SATS) is proposed in this work, and its performance is tested on a densely deployed, large-sized WSN testbed in different LOS conditions. Further, H-SATS is compared with the traditional regression-based method, which is the core synchronization scheme for different synchronization protocols in clustered WSNs. Experiments show that H-SATS outperforms the regression method in synchronization accuracy by up to 26.7% for a 30-node network.

Keywords: Cluster-based topology WSN, line-of-sight (LOS) conditions, non-line-of-sight (NLOS) condition, time synchronization protocol, wireless sensor networks (WSN)

DOI: 10.3233/JIFS-179005

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4531-4543, 2019

Price: EUR 27.50
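
The regression-based baseline mentioned in the abstract above typically fits a straight line to pairs of local and reference timestamps. The following is a minimal sketch of that general idea using ordinary least squares. It is not H-SATS, and the timestamp values, the linear clock model and the function names (`fit_clock`, `to_reference`) are assumptions made for illustration.

```python
def fit_clock(local_times, reference_times):
    """Least-squares fit of reference = skew * local + offset."""
    n = len(local_times)
    if n < 2 or n != len(reference_times):
        raise ValueError("need at least two matching timestamp pairs")
    mean_x = sum(local_times) / n
    mean_y = sum(reference_times) / n
    sxx = sum((x - mean_x) ** 2 for x in local_times)
    if sxx == 0:
        raise ValueError("local timestamps must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(local_times, reference_times))
    skew = sxy / sxx
    offset = mean_y - skew * mean_x
    return skew, offset


def to_reference(local_time, skew, offset):
    """Translate a local clock reading into the reference time base."""
    return skew * local_time + offset


if __name__ == "__main__":
    # Made-up timestamp pairs exchanged between a node and its cluster head.
    local = [0.0, 10.0, 20.0, 30.0]
    reference = [1.0, 21.0, 41.0, 61.0]
    skew, offset = fit_clock(local, reference)
    print(skew, offset, to_reference(40.0, skew, offset))
```

With the fitted skew and offset a node can translate its own clock readings without resynchronizing at every event; synchronization accuracy is then the residual error of that translation.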

Special Section: Intelligent and Fuzzy Systems applied to Language & Knowledge Engineering, Guest Editors: David Pinto and Vivek Singh

Article Type: Other

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4545-4545, 2019

Intelligent and fuzzy systems applied to language & knowledge engineering

Authors: Pinto, D. | Singh, V.

Article Type: Editorial

DOI: 10.3233/JIFS-179006

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4547-4552, 2019

Prediction of reading difficulty in Russian academic texts

Authors: Solovyev, Valery | Solnyshkina, Marina | Ivanov, Vladimir | Batyrshin, Ildar

Article Type: Research Article

Abstract: Education policy makers view measuring the readability of academic texts and profiling classroom textbooks as a primary task of education management aimed at sustaining the quality of reading programs. As Russian readability metrics, i.e. “objective” features of texts determining their complexity for readers, are still a research niche, we undertook a comparative analysis of the features of academic texts, exemplified by textbooks on Social Science and examination texts of Russian as a foreign language. Experiments with 7 classifiers and 4 methods of linear regression on the Russian Readability corpus demonstrated that ranking textbooks for native speakers is a much more difficult task than ranking examination texts written (or designed) for foreign students. The authors see a possible reason for this in the differences between two processes: acquiring a native language on the one hand and learning a foreign language on the other. The results of the current study are extremely relevant in modern Russia, which is joining the Bologna Process and needs to provide profiled texts for all types of learners and testees. Based on a qualitative and quantitative analysis of a text, the research offers a guide for education managers to help build consensus on selecting reading material when educators have differing views.

Keywords: Text readability, machine learning, Russian academic text, text complexity, examination tests

DOI: 10.3233/JIFS-179007

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4553-4563, 2019

Price: EUR 27.50

A corpus for argument analysis of academic writing: argumentative paragraph detection

Authors: Garcia-Gorrostieta, Jesús Miguel | López-López, Aurelio

Article Type: Research Article

Abstract: Academic writing is a complex task which requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this skill is difficult to master. To help develop this skill, we present a freely available annotated corpus to support research in argumentation in Spanish. To build it, we elaborated an annotation guide to identify argumentation in paragraphs. The guide also specified how to mark segments of sentences as a claim or premise, and how to indicate relations (support or attack) between such segments. Then, an annotated corpus of 300 sections was created. After its construction, the corpus was used to perform an exploratory analysis which aimed to identify and present the amount of argumentation in each section, as well as resulting patterns for argument identification. We also report an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. The results of the experiments to evaluate argumentative paragraph detection were encouraging. In addition, we discuss a web-based prototype for argument detection in paragraphs to reach the broader academic community of students, instructors and researchers.

Keywords: Argumentation, academic writing, annotated theses corpus, argumentative paragraph detection, argument markers

DOI: 10.3233/JIFS-179008

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4565-4577, 2019

Price: EUR 27.50

An unsupervised method for automatic validation of verbal phraseological units

Authors: Sánchez, Belém Priego | Pinto, David

Article Type: Research Article

Abstract: In this paper we present an unsupervised technique for validating the existence of verbal phraseological units in raw text. The technique employs the concept of internal and contextual attraction, a mathematical formula based on the co-occurrence of terms inside and outside the terms considered to be part of a verbal phraseological unit. Experiments carried out on a corpus of news stories report 60% accuracy, which highlights how challenging the automatic validation of verbal phraseological units in raw text is.

Keywords: Unsupervised methods, term co-occurrence, phraseological units

DOI: 10.3233/JIFS-179009

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4579-4585, 2019

Price: EUR 27.50

A Lexical Search Model based on word association norms

Authors: Reyes-Magaña, Jorge | Bel-Enguix, Gemma | Gómez-Adorno, Helena | Sierra, Gerardo

Article Type: Research Article

Abstract: This work introduces a lexical search model based on a type of knowledge graph, namely word association norms. The aim of the search is to retrieve a target word given the description of a concept, i.e., the query. This differs from traditional information retrieval models, where complete documents related to the query are retrieved. Our algorithm looks for the keywords of the definition in a graph built over a corpus of word association norms for Mexican Spanish, and computes the centrality in order to find the relevant concept. We performed experiments over a corpus of human definitions in order to evaluate our model. The results are compared with a Boolean information retrieval (IR) model, the BM25 text-retrieval algorithm, an algorithm based on word vectors and an online onomasiological dictionary, OneLook Reverse Dictionary. The experiments show that our lexical search method outperforms the IR models in our case study.

Keywords: Information retrieval, word association norms, natural language graphs, lexical search

DOI: 10.3233/JIFS-179010

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4587-4597, 2019

Price: EUR 27.50
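
BM25, one of the baselines named in the abstract above, scores each candidate text against the query by combining term frequency, inverse document frequency and length normalisation. The following is a minimal sketch of the common Okapi BM25 formula, not the paper's experimental setup: the parameter values `k1=1.5` and `b=0.75` are conventional defaults, the smoothed IDF variant is one of several in use, and the tokenised definitions are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for a tokenised query."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count documents containing each term

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF that stays positive even for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Hypothetical tokenised definitions, one per candidate target word.
    definitions = [
        ["small", "domestic", "feline", "animal"],   # e.g. "cat"
        ["large", "striped", "wild", "feline"],      # e.g. "tiger"
        ["vehicle", "with", "four", "wheels"],       # e.g. "car"
    ]
    print(bm25_scores(["domestic", "feline"], definitions))
```

In a reverse-dictionary setting like this one, the candidate whose definition scores highest would be returned as the target word.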

Siamese hierarchical attention networks for extractive summarization

Authors: González, José-Ángel | Segarra, Encarna | García-Granada, Fernando | Sanchis, Emilio | Hurtado, Lluís-F.

Article Type: Research Article

Abstract: In this paper, we present an extractive approach to document summarization based on Siamese Neural Networks. Specifically, we propose the use of Hierarchical Attention Networks to select the most relevant sentences of a text to make its summary. We train Siamese Neural Networks using document-summary pairs to determine whether the summary is appropriate for the document or not. By means of a sentence-level attention mechanism, the most relevant sentences in the document can be identified. Hence, once the network is trained, it can be used to generate extractive summaries. The experimentation carried out using the CNN/DailyMail summarization corpus shows the adequacy of the proposal. In summary, we propose a novel end-to-end neural network to address extractive summarization as a binary classification problem, which obtains promising results in line with the state of the art on the CNN/DailyMail corpus.

Keywords: Siamese neural networks, hierarchical attention networks, automatic text summarization

DOI: 10.3233/JIFS-179011

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4599-4607, 2019

Price: EUR 27.50

An evolutionary logistic regression method to identify confused drug names

Authors: Millán-Hernández, Christian Eduardo | García-Hernández, René Arnulfo | Ledeneva, Yulia

Article Type: Research Article

Abstract: Confused drug names are a common cause of medication errors and are related to look-alike and sound-alike drug names. To identify confused drug name pairs, individual similarity measures between the drug names are used. In the state of the art, logistic regression with the standard learning algorithm has been used to combine individual similarity measures. However, only three similarity measures have been combined, and the results of previous research do not outperform any individual measure with statistical significance. In addition, the problem of potentially confused drug name pairs presents a highly unbalanced dataset distribution, which is a hard problem for supervised machine learning models. In this paper, an improved combined logistic regression measure based on 21 individual measures is presented with the standard learning algorithm. We also present an evolutionary learning method for a combined logistic regression measure that allows learning from an unbalanced dataset. According to experiments with a gold standard dataset, our proposed combined measures outperform previous research with statistical significance in identifying pairs of confused drug names. In addition, the rankings of individual and combined similarity measures are presented.

Keywords: Look-alike sound-alike drug names, patient safety, logistic regression, genetic algorithm, imbalanced dataset.

DOI: 10.3233/JIFS-179012

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4609-4619, 2019

Price: EUR 27.50

Providing order to the handwritten TLS task: A complexity index

Authors: García-Calderón, Miguel Ángel | García-Hernández, René Arnulfo | Ledeneva, Yulia

Article Type: Research Article

Abstract: Text Line Segmentation (TLS) methods are intended to locate and separate text lines in document images for different stages of image analysis such as word spotting, keyword search, text alignment, text recognition and other stages of indexing involved in the retrieval of information from handwritten documents. The design of the proposed TLS methods and the tuning of their parameters assume a level of complexity that depends on the language and writing style of a document collection. Therefore, the performance of these methods is not maintained on documents of greater or lesser complexity. In this paper, we present TLS-ICI, a TLS Intrinsic Complexity Index that measures the complexity of a document for the TLS task without the need for a human gold standard. Through experimentation, we demonstrate how our proposed TLS-ICI provides an ordering of both the TLS methods and the image-based handwritten documents. In this way, with our proposed complexity index it is possible to select the most appropriate method for each document of a collection, reducing the time spent on exhaustive tests and increasing performance. In addition, we demonstrate through a new hybrid TLS method that the TLS-ICI outperforms previous individual TLS methods. The dataset consists of several standard TLS collections of contemporary and ancient texts in different languages and alphabets, such as English, Spanish, Arabic, Chinese, Greek, Khmer, Persian, Bengali, Oriya, Kannada and Nahuatl.

Keywords: Visual complexity in handwritten documents, handwritten text line segmentation, text line segmentation, document image processing, projection profile, historical documents, multilingual document analysis, handwritten recognition

DOI: 10.3233/JIFS-179013

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4621-4631, 2019

Price: EUR 27.50

Medical events extraction to analyze clinical records with conditional random fields

Authors: Fócil-Arias, Carolina | Sidorov, Grigori | Gelbukh, Alexander

Article Type: Research Article

Abstract: The rapid growth in the extraction of clinical events from unstructured clinical records has raised considerable challenges. In this paper, we propose the use of different features with a statistical modeling method called conditional random fields, which is considered an effective algorithm for sequence tagging problems. Our goal is to determine how feature selection affects the performance of four subtasks presented in SemEval Task-12: Clinical TempEval 2016. We applied careful preprocessing, and the proposed method was tested on real clinical records from Task-12: Clinical TempEval 2016. The comparative analyses indicate that our proposal achieves good results compared to the work presented in the Task-12: Clinical TempEval 2016 challenge.

Keywords: Clinical reports, medical information extraction, natural language processing, machine learning, feature selection, conditional random fields

DOI: 10.3233/JIFS-179014

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4633-4643, 2019

Price: EUR 27.50

Scalable text semantic clustering around topics

Authors: Brena, Ramon | Ramirez, Eduardo

Article Type: Research Article

Abstract: Detection of topics in Natural Language text collections is an important step towards flexible automated text handling, for tasks like text translation, summarization, etc. In the currently dominant paradigm of topic modeling, topics are represented as probability distributions over terms. Although such models are theoretically sound, their high computational complexity makes them difficult to use on very large-scale collections. In this work, we propose an alternative topic modeling paradigm based on a simpler representation of topics as overlapping clusters of semantically similar documents, which can take advantage of highly scalable clustering algorithms. Our Query-based Topic Modeling framework (QTM) is an information-theoretic method that assumes the existence of a “golden” set of queries that can capture most of the semantic information of the collection and produce models with maximum “semantic coherence”. QTM was designed with scalability in mind and was executed in parallel using a Map-Reduce implementation; further, we show complexity measures that support our scalability claims. Our experiments show that QTM can produce models of comparable or even superior quality to those produced by state-of-the-art probabilistic methods.

Keywords: Topics, NLP, clustering, queries

DOI: 10.3233/JIFS-179015

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4645-4657, 2019

Price: EUR 27.50

A quantitative and text-based characterization of big data research

Authors: Gupta, Vedika | Singh, Vivek Kumar | Ghose, Udayan | Mukhija, Pankaj

Article Type: Research Article

Abstract: This paper maps the research work carried out in the field of Big Data through a detailed analysis of scholarly articles published on the theme during 2010-16, as indexed in Scopus. We have collected and analyzed all relevant publications on Big Data, as indexed in Scopus, through a quantitative as well as textual characterization. The analysis delves into parameters such as research productivity, growth of research and citations, thematic trends, top publication sources and emerging topics in this field. The study also investigates country-wise publication output and impact in terms of average citations per paper, country-level collaboration patterns, authorship and leading contributors (countries, institutions), etc. The scholarly publication data is also subjected to a detailed textual analysis to identify key themes in Big Data research, disciplinary variations, and thematic trends and patterns. Quantitative measures show that there has been a tremendous increase in the number of publications related to Big Data during the last few years. Research work in Big Data, though primarily considered a sub-discipline of Computer Science, is now carried out by researchers in many disciplines. Thematic analysis of publications in Big Data shows that it is a discipline attracting research interest from fields as diverse as Medicine and Social Sciences. The paper also identifies major keywords now associated with Big Data research, such as Cloud Computing, Deep Learning, Social Media and Data Analytics. This helps in a thorough understanding and visualization of the Big Data research area.

Keywords: Big data, big data analytics, data science, scientometrics

DOI: 10.3233/JIFS-179016

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4659-4675, 2019

Price: EUR 27.50

Locality-sensitive hashing of permutations for proximity searching

Authors: Figueroa, Karina | Camarena-Ibarrola, Antonio | Valero-Elizondo, Luis | Reyes, Nora

Article Type: Research Article

Abstract: Similarity searching is at the core of many applications in artificial intelligence since it solves problems like nearest neighbor searching. A common approach to similarity searching consists of mapping the database to a metric space in order to build an index that allows for fast searching. One of the most powerful searching algorithms for high-dimensional data is the permutation-based algorithm (PBA). However, PBA has to collect the permutations most similar to a given query’s permutation. In this paper, we show how to speed up this process by proposing several novel hash functions for Locality Sensitive Hashing (LSH) with PBA. At search time, our technique allows discarding up to 50% of the database to answer the query with a candidate list obtained in constant time.
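As a rough illustration of the idea summarized above, the following Python sketch represents each object by the order in which it sees a set of pivots and buckets objects by a prefix of that permutation. The prefix hash, the pivots, and the function names are assumptions chosen for illustration; the abstract does not specify the hash functions the authors propose.

```python
import math
from collections import defaultdict

def permutation(obj, pivots):
    """Return pivot indices ordered from nearest to farthest from obj."""
    return tuple(sorted(range(len(pivots)),
                        key=lambda i: (math.dist(obj, pivots[i]), i)))

def prefix_hash(perm, k=2):
    """Hypothetical hash: the k nearest pivots, in order, form the bucket key."""
    return perm[:k]

def build_index(database, pivots, k=2):
    """Group database positions by the hash of their permutation."""
    buckets = defaultdict(list)
    for idx, obj in enumerate(database):
        buckets[prefix_hash(permutation(obj, pivots), k)].append(idx)
    return buckets

def candidates(query, buckets, pivots, k=2):
    """Look up the query's bucket; every other bucket is discarded unseen."""
    return buckets.get(prefix_hash(permutation(query, pivots), k), [])

# Toy data (assumed): three pivots and four 2-D points.
pivots = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [(1.0, 0.5), (9.0, 1.0), (1.0, 9.0), (2.0, 1.0)]
index = build_index(database, pivots)
```

Once the query's permutation is computed, the candidate list comes from a single dictionary lookup; the candidates would still need to be verified with the real distance function.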

Keywords: Nearest neighbor, similarity searching, metric spaces

DOI: 10.3233/JIFS-179017

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4677-4684, 2019

Price: EUR 27.50

Binary vector transformation of math formula for mathematical information retrieval

Authors: Pathak, Amarnath | Pakray, Partha | Gelbukh, Alexander

Article Type: Research Article

Abstract: Scientific documents, which consist largely of math formulae, are a primary source of scientific and technical information. However, the indexing and search processes of conventional search engines barely account for the mathematical content of such documents. Although the recent past has witnessed a surge in the number of Mathematical Information Retrieval (MIR) systems intended to retrieve math formulae from scientific documents, the low values of their evaluation measures indicate scope for improvement. To cope with the challenges of MIR, and to improve on the performance of state-of-the-art systems, a novel approach called Binary Vector Transformation of Math Formula (BVTMF) is introduced. The implemented system extracts MathML formulae from the documents, preprocesses them, and renders them into fairly large binary vectors (vectors of ‘0’s and ‘1’s). The generated formula vector represents the information content of the corresponding formula. For indexing and searching text content, the system relies on Apache Lucene. Text and math search results retrieved by independent text and math sub-systems are re-ranked to prioritize results containing both the text and math components of the user query. The quality of the retrieved search results and the appreciable values of the evaluation measures substantiate the competence of the proposed approach.

Keywords: Mathematical information retrieval, binary vector transformation, math formula search, scientific document retrieval, precision, bit position information table

DOI: 10.3233/JIFS-179018

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4685-4695, 2019

Price: EUR 27.50

Choosing the right loss function for multi-label Emotion Classification

Authors: Hurtado, Lluís-F. | González, José-Ángel | Pla, Ferran

Article Type: Research Article

Abstract: Natural Language Processing problems have recently benefited from advances in Deep Learning. Many of these problems can be addressed as multi-label classification problems. Usually, the metrics used to evaluate classification models are different from the loss functions used in the learning process. In this paper, we present a strategy to incorporate evaluation metrics into the learning process in order to increase the performance of the classifier according to the measure we want to favor. Concretely, we propose soft versions of the Accuracy, micro-F1, and macro-F1 measures that can be used as loss functions in the back-propagation algorithm. To validate our approach experimentally, we tested our system on an Emotion Classification task proposed at the International Workshop on Semantic Evaluation, SemEval-2018. Using a Convolutional Neural Network trained with the proposed loss functions, we obtained significant improvements for both the English and the Spanish corpora.
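To make the notion of a "soft" metric concrete, here is a minimal Python sketch of one common way to relax macro-F1: predicted probabilities replace thresholded 0/1 decisions when counting true positives, false positives, and false negatives, so the score varies smoothly with the model outputs. This is an assumed, generic formulation and not necessarily the one defined in the paper; the function name and the `eps` smoothing term are illustrative.

```python
def soft_macro_f1_loss(y_true, y_prob, eps=1e-8):
    """Return 1 - soft macro-F1 for multi-label data.

    y_true: list of rows with 0/1 labels, one column per label.
    y_prob: list of rows with predicted probabilities in [0, 1].
    """
    n_labels = len(y_true[0])
    f1_per_label = []
    for j in range(n_labels):
        # Soft counts: probabilities stand in for hard decisions.
        tp = sum(t[j] * p[j] for t, p in zip(y_true, y_prob))
        fp = sum((1 - t[j]) * p[j] for t, p in zip(y_true, y_prob))
        fn = sum(t[j] * (1 - p[j]) for t, p in zip(y_true, y_prob))
        f1_per_label.append(2 * tp / (2 * tp + fp + fn + eps))
    # Macro averaging: every label contributes equally.
    return 1.0 - sum(f1_per_label) / n_labels
```

In a deep learning framework the same arithmetic would be written with tensor operations so that gradients can flow through it; a micro-averaged variant would sum the soft counts over all labels before computing a single F1.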

Keywords: Deep Learning, loss function, multi-label classification, Natural Language Processing, Emotion Classification

DOI: 10.3233/JIFS-179019

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4697-4708, 2019

Price: EUR 27.50

Predicting emotional intensity in social networks

Authors: Rodríguez, Fernando M. | Garza, Sara E.

Article Type: Research Article

Abstract: Emotions, which are now commonly portrayed in social media, play a fundamental role in decision making. Taking this into account, this work proposes a model to predict (forecast) emotions in social networks. Specifically, the model predicts, for a user, the proportion of comments that will be published with a particular emotion; this proportion is defined as the emotional intensity of the user in a particular time period. Unlike other models, which focus on a single emotion, the proposed model considers a basic scheme of four emotions and employs them in an interdependent manner. Moreover, the model uses three types of features: (1) user-related, (2) contact-related, and (3) environment-related. Prediction is performed using linear regression. The proposed model outperforms nearly 20 models, including ARIMA, with statistically significant results when evaluated on a dataset extracted from Twitter. Potential applications include massive opinion monitoring and recommendations to improve the emotional wellness of social media users (for example, the recommendation of joyful memories).
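The quantity being forecast, emotional intensity, is defined above as a proportion of comments. A small Python sketch of how that target could be computed for one user and one time period follows. The four emotion names are placeholders, since the abstract says only that a basic scheme of four emotions is used, and the function name is hypothetical.

```python
from collections import Counter

# Placeholder emotion scheme (assumed); the abstract does not name the four emotions.
EMOTIONS = ("joy", "anger", "sadness", "fear")

def emotional_intensity(comment_labels):
    """Return, per emotion, the fraction of a user's comments carrying it.

    comment_labels: one emotion label per comment published by the user
    in the time period of interest.
    """
    total = len(comment_labels)
    if total == 0:
        # No comments in the period: every intensity is zero.
        return {emotion: 0.0 for emotion in EMOTIONS}
    counts = Counter(comment_labels)
    return {emotion: counts[emotion] / total for emotion in EMOTIONS}

period_labels = ["joy", "joy", "anger", "fear"]
intensity = emotional_intensity(period_labels)
```

A regression model such as the one described would then be trained to predict these proportions for a future period from user, contact, and environment features.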

Keywords: Prediction, emotion, machine learning, Twitter, social networks

DOI: 10.3233/JIFS-179020

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4709-4719, 2019

Price: EUR 27.50

Aspect-based sentiment analysis of mobile reviews

Authors: Gupta, Vedika | Singh, Vivek Kumar | Mukhija, Pankaj | Ghose, Udayan

Article Type: Research Article

Abstract: E-commerce websites provide an easy platform for users to put forth their viewpoints on different topics, ranging from a news item to any product on the market. Such online content encourages authors to express opinions on various aspects of an entity. Aspect-based sentiment analysis analyzes this textual content to find the aspects in question. After the aspects are located, the corresponding sentiment-bearing words are identified. This paper describes an integrated system that generates opinionated aspect-based graphical and extractive summaries from a large set of mobile reviews. The system focuses on three tasks: (a) identification of aspects in a given field, (b) computation of the sentiment polarity of each aspect, and (c) generation of opinionated aspect-based graphical and extractive summaries. The system has been evaluated on three mobile-review datasets and obtains better precision and recall than the baseline approach. The system generates summaries from reviews without any training.

Keywords: Aspect-based sentiment analysis, extractive summary, sentiment summarization

DOI: 10.3233/JIFS-179021

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4721-4730, 2019

Price: EUR 27.50

Predicting the helpfulness of game reviews: A case study on the Steam store

Authors: Baowaly, Mrinal Kanti | Tu, Yi-Pei | Chen, Kuan-Ta

Article Type: Research Article

Abstract: Online user reviews play an important role in the assessment of product quality, and thus these reviews should be evaluated carefully. This study evaluates the helpfulness of game reviews on the online Steam store. It collects a large set of user reviews of different game genres and builds a classification model to predict whether these reviews are helpful or not. This model can accurately predict the helpfulness of the reviews based on different thresholds. This work also investigates various types of textual and word embedding features and analyzes their importance for predictions. Furthermore, it develops a regression-based model that can predict the score or rating of game reviews on Steam.

Keywords: Steam, online review, review helpfulness, semantic analysis, word embedding

DOI: 10.3233/JIFS-179022

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4731-4742, 2019

Price: EUR 27.50

Online Hate Speech against Women: Automatic Identification of Misogyny and Sexism on Twitter

Authors: Frenda, Simona | Ghanem, Bilal | Montes-y-Gómez, Manuel | Rosso, Paolo

Article Type: Research Article

Abstract: Patriarchal behavior, like other social habits, has been transferred online, appearing as misogynistic and sexist comments, posts or tweets. This online hate speech against women has serious consequences in real life, and recently, various legal cases have arisen against social platforms that scarcely block the spread of hate messages towards individuals. In this difficult context, this paper presents an approach that detects the two sides of patriarchal behavior, misogyny and sexism, by analyzing three collections of English tweets, obtaining promising results.

Keywords: Misogyny detection, sexism detection, linguistic analysis

DOI: 10.3233/JIFS-179023

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4743-4752, 2019

Price: EUR 27.50

Similarity metrics analysis for principal concepts detection in ontology creation

Authors: Alemán, Yuridiana | Somodevilla, María J. | Vilariño, Darnes

Article Type: Research Article

Abstract: In this paper, an analysis based on similarity metrics was carried out in order to detect the main concepts related to the superclasses in a pedagogical domain ontology. A corpus of articles in Spanish was built semi-automatically. The corpus was then lemmatized and three representations were extracted. Four textual similarity metrics based on terms and Pointwise Mutual Information were implemented. From each experiment, a list of words was retrieved according to established thresholds for the metrics and evaluated against a gold standard built by a domain expert. Precision and recall were used in the evaluation step, and a detailed discussion by representation and class is presented. Results showed higher precision for the types of intelligences class and the 5-grams representation.

Keywords: Ontology learning, pedagogical domain, NLP.

DOI: 10.3233/JIFS-179024

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4753-4764, 2019

Price: EUR 27.50

Rule-based expert system for detection of coffee rust warnings in Colombian crops

Authors: Buitrón, Edwar Javier Girón | Corrales, David Camilo | Avelino, Jacques | Iglesias, Jose Antonio | Corrales, Juan Carlos

Article Type: Research Article

Abstract: Coffee rust is a devastating disease that causes large economic losses across the world. The severity of this disease changes over time, so farmers are not fully aware of the economic importance of rust in coffee crops. From a computational science perspective, several expert systems based on machine learning techniques have been proposed to reduce the effects of coffee rust. However, because samples of coffee rust incidence are few, the rules created by machine learning techniques do not contain enough information to cover the diversity of scenarios for detecting coffee rust. This paper proposes a rule-based expert system in which the rules are created from the expert knowledge of specialists and from technical reports about the behavior of the disease during a crop year. As far as we know, this is the first expert system for the coffee rust problem that uses not only expert knowledge but also technical reports. The Buchanan methodology is used to design the proposed system. Experimental results show an average accuracy of 66.67% in detecting a correct warning of coffee rust levels.

Keywords: Decision support system, crops, disease, agriculture, hemileia vastatrix

DOI: 10.3233/JIFS-179025

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4765-4775, 2019

Price: EUR 27.50

In the pursuit of semantic similarity for literature on microbial transcriptional regulation

Authors: Lithgow-Serrano, Oscar | Collado-Vides, Julio

Article Type: Research Article

Abstract: The constant increase in the production of scientific literature is making it very difficult for experts to keep up to date with the state-of-the-art knowledge in their fields. The use of Natural Language Processing (NLP) is becoming a necessary aid to tackle this challenge. In the NLP field, the task of measuring semantic similarity between two sentences plays a vital role. It is a cornerstone for tasks like Q&A, Information Retrieval, Automatic Summarization, etc., and it is a crucial element in the ultimate goal of computers being able to decode what is conveyed in human language expression. Measuring Semantic Similarity (SS) in short texts has specific challenges. Because there are fewer words to be compared, the meaning contribution of each word is more relevant, and it is important to take into account the contribution of syntax to the composed meaning. In addition, the highly specific and specialized vocabulary (Microbial Transcriptional-Regulation) implies a lack of massive training resources. Our approach has been to use an ensemble of similarity metrics, including string, distributional, and knowledge-based metrics, and to combine the results of such analyses. We have trained and tested these methods on a similarity corpus developed in-house. The task has proved very challenging, and the ensemble strategy has proved to be a good approach. Even though there is still much room for improvement in the precision of our methods relative to human evaluation, we have managed to improve them, reaching a strong correlation (ρ = 0.700).

Keywords: Natural Language Processing, Semantic Textual Similarity

DOI: 10.3233/JIFS-179026

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4777-4786, 2019

Price: EUR 27.50

Generating image captions through multimodal embedding

Authors: Dash, Sandeep Kumar | Saha, Saurav | Pakray, Partha | Gelbukh, Alexander

Article Type: Research Article

Abstract: Caption generation requires the best of both Computer Vision and Natural Language Processing. Due to recent improvements in both, many efficient models have been developed. Automatic Image Captioning can be used to provide descriptions of website content or to generate frame-by-frame descriptions of video for the vision-impaired, among many other applications. In this work, we describe a model that generates novel captions for a previously unseen image using a multimodal architecture that combines a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). The model is trained on Microsoft Common Objects in Context (MSCOCO), an image captioning dataset, to align captions and images in the same representation space, so that an image is close to its relevant captions in that space and far away from dissimilar captions and dissimilar images. The ResNet-50 architecture is used to extract features from the images, and GloVe embeddings are used with a Gated Recurrent Unit (GRU) in the RNN for text representation. The MSCOCO evaluation server is used to evaluate the machine-generated caption for a given image.

Keywords: Image captioning, convolutional neural network

DOI: 10.3233/JIFS-179027

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4787-4796, 2019

Price: EUR 27.50

Measuring interpretable semantic similarity of sentences using a multi chunk aligner

Authors: Majumder, Goutam | Pakray, Partha | Pinto, David

Article Type: Research Article

Abstract: This work focuses on strengthening the pre-existing Interpretable Semantic Textual Similarity (iSTS) method, which enables a user to understand the behaviour of an artificial intelligence system. The proposed iSTS method explains the similarities and differences between a pair of sentences. The objective of the iSTS problem is to formalize the alignment between a pair of text segments and to label the relationship between the text fragments with a relation type and relatedness score. The overall objective of this work is to develop a 1:M multi-chunk aligner for an iSTS method, trained on the SemEval 2016 Task 2 dataset. The obtained result outperforms many state-of-the-art aligners that were part of the SemEval 2016 iSTS task.

Keywords: WordNet, interpretability, semantic similarity, Natural Language Processing, cosine similarity

DOI: 10.3233/JIFS-179028

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4797-4808, 2019

Price: EUR 27.50

Extraction of reordering rules for statistical machine translation

Authors: Srivastava, Jyoti | Sanyal, Sudip | Srivastava, Ashish Kumar

Article Type: Research Article

Abstract: Word reordering is an important problem for translation between languages with different structures, such as Subject-Verb-Object and Subject-Object-Verb. This paper presents a statistical method for extracting linguistic rules using chunks to reorder the output of a baseline statistical machine translation system for improved performance. The experiments are based on the TDIL sample tourism corpus for the English-Hindi language pair, which consists of 1000 sentence pairs, of which 900 are used for training, 50 for tuning and 50 for testing. Finally, the output of the machine translation system, augmented by these rules, is evaluated using the BLEU and NIST metrics. The BLEU score improves by more than 2% in comparison to the baseline SMT system. The results are also compared with those of the Google translation system, which has been trained on a huge corpus. We obtained a 0.1 point improvement in NIST score in comparison to Google Translation. Thus, we obtain comparable results with a small corpus of only 900 training sentence pairs. This paper is an effort to improve the performance of SMT with a small corpus by using linguistic rules that are generated automatically instead of written by a linguist.

Keywords: Statistical machine translation, chunk, rule extraction, reordering rules, hybrid machine translation

DOI: 10.3233/JIFS-179029

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4809-4819, 2019

Price: EUR 27.50

Word sense induction in Bengali using parallel corpora and distributional semantics

Authors: Sengupta, Saptarshi | Pandit, Rajat | Mitra, Parag | Naskar, Sudip Kumar | Sardar, Mohini Mohan

Article Type: Research Article

Abstract: One of the most challenging research problems in natural language processing (NLP) is that of word sense induction (WSI). It involves discovering the senses of a word given its contexts of usage without the use of a sense inventory, which differentiates it from traditional word sense disambiguation (WSD). This paper reports work on sense induction in Bengali, a less-resourced language, based on distributional semantics and translation-based context vectors learned from parallel corpora to improve task performance. The performance of the proposed method of sense induction was compared with the k-means algorithm, which was considered the baseline in our work. A dataset for sense induction was created for 15 Bengali words, encompassing a total of 111 contexts. The proposed model, in both mono- and cross-lingual settings, outperformed k-means in precision (P), recall (R) and F-score. K-means based sense induction produced average P, R and F-scores of 0.71, 0.73 and 0.66, respectively. The average P, R and F-scores produced by the proposed algorithm are 0.77, 0.73 and 0.68 in the monolingual setting and 0.81, 0.77 and 0.72 in the cross-lingual setting.

Keywords: Word sense induction (WSI), parallel corpora, translation, Word2Vec, context clustering

DOI: 10.3233/JIFS-179030

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4821-4832, 2019

Price: EUR 27.50

Author profiling for age and gender using combinations of features of various types

Authors: Ameer, Iqra | Sidorov, Grigori | Nawab, Rao Muhammad Adeel

Article Type: Research Article

Abstract: The automatic identification of an author’s demographic traits, such as gender, age, native language, geographical location, and personality type, from his/her written text is termed author profiling (AP). It currently engages the research community due to its promising uses in security, marketing, forensics, and bogus account identification on public networks. A variety of benchmark corpora (English text) released by the PAN shared task are used in our experiments. This study presents a content-based approach for detecting an author’s traits (age group and gender) for same-genre author profiles. In our proposed method, we used different sets of features, including syntactic n-grams of part-of-speech tags, traditional n-grams of part-of-speech tags, combinations of word n-grams, and combinations of character n-grams. We tried a range of classifiers for several profile sizes. We used word unigrams and character trigrams as our baseline approaches. We achieved best accuracies of 0.496 and 0.734 for age group and gender, respectively, by applying the combination of word n-grams of various sizes. Experimental results indicate that the combination of word n-grams can produce good results on benchmark corpora.

Keywords: Author profiling, machine learning, syntactic n-grams, traditional n-grams, part-of-speech

DOI: 10.3233/JIFS-179031

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4833-4843, 2019

Price: EUR 27.50

A convolutional neural network approach for gender and language variety identification

Authors: Gómez-Adorno, Helena | Fuentes-Alba, Roddy | Markov, Ilia | Sidorov, Grigori | Gelbukh, Alexander

Article Type: Research Article

Abstract: We present a method for gender and language variety identification using a convolutional neural network (CNN). We compare the performance of this method with a traditional machine learning algorithm, support vector machines (SVM), trained on character n-grams (n = 3–8) and lexical features (unigrams and bigrams of words), and their combinations. We use a single multi-labeled corpus composed of news articles in different varieties of Spanish developed specifically for these tasks. We present a convolutional neural network architecture trained on word- and sentence-level embeddings that can be successfully applied to gender and language variety identification on a relatively small corpus (fewer than 10,000 documents). Our experiments show that the deep learning approach outperforms a traditional machine learning approach on both tasks when named entities are present in the corpus. However, when all named entities are reduced to a single symbol “NE” to avoid topic-dependent features, the drop in accuracy is higher for the deep learning approach.

Keywords: Convolutional neural networks, deep learning, author profiling, gender identification, language variety identification, machine learning, character n-grams, Spanish

DOI: 10.3233/JIFS-179032

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4845-4855, 2019

Price: EUR 27.50

A comparative analysis of distributional term representations for author profiling in social media

Authors: Álvarez-Carmona, Miguel A. | Villatoro-Tello, Esaú | Montes-Y-Gómez, Manuel | Villaseñor-Pineda, Luis

Article Type: Research Article

Abstract: Author Profiling (AP) aims at predicting specific characteristics of a group of authors by analyzing their written documents. Much research has focused on determining suitable features for modeling authors’ writing patterns. Reported results indicate that content-based features continue to be the most relevant and discriminative features for solving this task. Thus, in this paper, we present a thorough analysis of the appropriateness of different distributional term representations (DTR) for the AP task. In this regard, we introduce a novel framework for supervised AP using these representations and, based on it, carry out a comparative analysis of representations such as DOR, TCOR, SSR, and word2vec in the AP problem. We also compare the performance of the DTRs against classic approaches, including popular topic-based methods. The obtained results indicate that DTRs are suitable for solving the AP task in social media domains, as they achieve competitive results while providing meaningful interpretability.

Keywords: Author profiling, document representation, distributional term representation, text classification, social media

DOI: 10.3233/JIFS-179033

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4857-4868, 2019

Price: EUR 27.50

Detection of fake news in a new corpus for the Spanish language

Authors: Posadas-Durán, Juan-Pablo | Gómez-Adorno, Helena | Sidorov, Grigori | Escobar, Jesús Jaime Moreno

Article Type: Research Article

Abstract: We present a new resource to analyze and detect deceptive information that is present in a huge number of news websites. Specifically, we compiled a corpus of news in the Spanish language extracted from several websites. The corpus is annotated with two labels (real and fake) for automatic fake news detection. Furthermore, the corpus also provides the category of the news, and we present a detailed analysis of vocabulary overlap among categories. Finally, we present a style-based fake news detection method. The obtained results show that the introduced corpus is an interesting resource for future research in this area.

Keywords: Fake news, corpus, Spanish, resource, machine learning

DOI: 10.3233/JIFS-179034

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4869-4876, 2019

Price: EUR 27.50

Classification of opinions in cross domains involving emotive values

Authors: Guzmán-Cabrera, Rafael | Sánchez, Belém Priego | Mukhopadhyay, T. Prasad | García, J.M. Lozano | Cordova-Fraga, T.

Article Type: Research Article

Abstract: It is increasingly common for internet users to have access to blogs and social networks, and common for them to express opinions on such sites. This research work is framed within the scope of opinion mining. Opinions allow us to measure people’s perception of a specific topic or product. Knowing the opinion that a person has of a product or service is of great help for decision making, since it allows, among other things, potential consumers to verify the quality of the product or service before using it. When the number of opinions is very large, the analysis becomes more complicated, and tools that allow this task to be performed automatically are generally sought. The present work performs an automatic categorization of textual opinions corresponding to four products: books, DVDs, kitchens, and electronics. Both negative and positive opinions are considered for the experiment. Further categorization experiments are performed using different domains of learning. The basic idea is to investigate whether we can classify opinions, positive and negative, of any given domain using training instances from a different domain. Results obtained from different methods of learning are presented. The results obtained allow us to examine the feasibility of the proposed methodology.

Keywords: Cross Domain, emotive classification, opinion classification

DOI: 10.3233/JIFS-179035

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4877-4887, 2019

Price: EUR 27.50

Human interaction with shopping assistant robot in natural language

Authors: Sidorov, Grigori | Markov, Ilia | Kolesnikova, Olga | Chanona-Hernández, Liliana

Article Type: Research Article

Abstract: In spite of having been investigated for over fifty years, developing a robust spoken dialog management system remains an open research issue in robotics and natural language processing. In this paper, we present a language-independent spoken dialog management module integrated into a human-robot interaction system. We adopt an algorithmic approach to dialog modeling. A mobile robot functioning as a shopping assistant exemplifies the proposed approach. The dialog module is composed of a state transition network, in which state switches are conditioned by both visual and communicative factors. We use the formalism of a finite state automaton, where the robot changes its state by performing a speech act or a non-verbal action from the set of specified act/action types.

Keywords: Shopping assistant robot, spoken dialog management, speech acts, state transition network, finite automaton, visual factors, communicative factors

DOI: 10.3233/JIFS-179036

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4889-4899, 2019

Price: EUR 27.50
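
The finite state automaton formalism described in the abstract above can be sketched as a transition table. This is only an illustration under stated assumptions: the state names, the factor names, and the acts in `TRANSITIONS` are hypothetical and are not taken from the paper. Each entry maps a (state, observed factor) pair to the act the robot performs and the state it moves to.

```python
# Hypothetical transition table: (state, observed factor) -> (act, next state).
# "Factors" stand in for the visual and communicative conditions in the abstract.
TRANSITIONS = {
    ("idle", "customer_seen"): ("say_greeting", "greeting"),
    ("greeting", "customer_asks_product"): ("lead_to_shelf", "guiding"),
    ("guiding", "product_reached"): ("say_goodbye", "closing"),
    ("closing", "customer_leaves"): ("return_to_base", "idle"),
}


def step(state: str, factor: str):
    """Return (act, next_state); stay in place with no act if nothing matches."""
    return TRANSITIONS.get((state, factor), (None, state))


def run(factors, start: str = "idle"):
    """Feed a sequence of observed factors and collect the acts performed."""
    state, acts = start, []
    for factor in factors:
        act, state = step(state, factor)
        if act is not None:
            acts.append(act)
    return state, acts


final_state, performed = run(["customer_seen", "customer_asks_product"])
```

Keeping the transitions in a plain dictionary makes the automaton easy to inspect and extend; unmatched factors leave the state unchanged, which is one possible design choice among several.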

Geographical aggregation of microblog posts for LDA topic modeling

Authors: López-Ramírez, Pablo | Molina-Villegas, Alejandro | Siordia, Oscar S.

Article Type: Research Article

Abstract: In this paper we propose an aggregation strategy for geolocated Twitter posts based on a hierarchical definition of the regular activity patterns within a specific region. The aggregation yields a series of documents that are used to train a topic model. The resulting model is tested against the ones produced by two other aggregation strategies proposed in the literature: aggregation by user and by hashtag. For comparison, we use quality metrics widely used in the literature. The results show that the Geographical Aggregation performs similarly to hashtag aggregation in terms of Jensen-Shannon Divergence and outperforms other aggregation schemes in its ability to reproduce the original cluster labels. Potential applications include the discovery of unusual events and serving as a basis for geolocating messages from text.

Keywords: Probabilistic topic modeling, geolocation, social network

DOI: 10.3233/JIFS-179037

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4901-4908, 2019

Price: EUR 27.50

Short-answer grading using textual entailment

Authors: Basak, Rohini | Naskar, Sudip Kumar | Gelbukh, Alexander

Article Type: Research Article

Abstract: Given a question, a reference answer, and the answer given by the student, the aim of the automatic short answer grading task is to assign a grade to the student’s answer. For this, we use a large number of matching rules that rely on recognizing the entailment relation between the dependency structures of the two answers. Comparison of the grades generated by our method with those given by human judges on a computer science dataset shows a quite promising maximum correlation of 0.627.

Keywords: Automatic short answer grading, recognizing textual entailment, dependency parsing, semantic similarity

DOI: 10.3233/JIFS-179038

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4909-4919, 2019

Price: EUR 27.50

Low-resource neural character-based noisy text normalization

Authors: Mager, Manuel | Rosales, Mónica Jasso | Çetinoğlu, Özlem | Meza, Ivan

Article Type: Research Article

Abstract: User-generated data in social networks is often not written in its standard form. This kind of text can lead to large dispersion in the datasets and to inconsistent data. Therefore, normalization of such texts is a crucial preprocessing step for common Natural Language Processing tools. In this paper we explore the state of the art of the machine translation approach to normalizing text under low-resource conditions. We also propose an auxiliary task for the sequence-to-sequence (seq2seq) neural architecture, novel to the text normalization task, that improves the base seq2seq model by up to 5%. This increase in performance closes the gap between statistical machine translation approaches and neural ones for low-resource text normalization.

Keywords: Noisy text, normalization, recurrent neural networks, low-resource, autoencoding

DOI: 10.3233/JIFS-179039

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4921-4929, 2019

Price: EUR 27.50

Frequent similar pattern mining using non Boolean similarity functions

Authors: Rodríguez-González, Ansel Y. | Martínez-Trinidad, José F. | Carrasco-Ochoa, Jesús A. | Ruiz-Shulcloper, José | Alvarado-Mentado, Matías

Article Type: Research Article

Abstract: There are many problems where the objects under study are described by mixed data (numerical and non-numerical features) and similarity functions different from exact matching are usually employed to compare them. Some algorithms for mining frequent patterns allow the use of Boolean similarity functions different from exact matching. However, they do not allow the use of non Boolean similarity functions. Transforming a non Boolean similarity function into a Boolean one, and then applying the previous algorithms for mining frequent patterns, could lead to the loss of some patterns and, moreover, to the generation of other patterns that should not be considered frequent similar patterns. In this paper, we extend frequent similar pattern mining by allowing the use of non Boolean similarity functions. Several properties for pruning the search space of frequent similar patterns, and a data structure that allows computing the frequency of candidate patterns, are proposed. Also, three algorithms for mining frequent patterns using non Boolean similarity functions are proposed. Experimental results show the efficiency and efficacy of the algorithms. The proposed algorithms obtain better patterns for classification than those obtained by traditional frequent pattern miners and by miners using Boolean similarity functions.

Keywords: Data mining, frequent patterns, similarity functions, Mixed data

DOI: 10.3233/JIFS-179040

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4931-4944, 2019

Price: EUR 27.50

Deterministic oversampling methods based on SMOTE

Authors: Rodriguez-Torres, Fredy | Carrasco-Ochoa, Jesús A. | Martínez-Trinidad, José Fco.

Article Type: Research Article

Abstract: In supervised classification, if one of the classes has fewer objects than the other, we have a class imbalance problem. One of the most common solutions to class imbalance problems is oversampling, and SMOTE is the most referenced and well-known oversampling method. However, SMOTE creates synthetic objects in a random way; therefore, it produces a different result each time it is applied, and in practice the user has to apply SMOTE several times to choose the best of all the generated balanced datasets. For this reason, in this paper, we present SMOTE-D, a deterministic version of SMOTE, and propose new deterministic SMOTE-D-based versions of some of the most recent and successful SMOTE-based methods. In our experiments, we show that all proposed deterministic methods produce results as good as those of the random methods, but our proposals need to be applied just once. This is very important from a practical point of view, since our proposals save time by avoiding the multiple applications that SMOTE requires, and they provide one unique result.

Keywords: Imbalanced datasets, oversampling, supervised classification

DOI: 10.3233/JIFS-179041

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4945-4955, 2019

Price: EUR 27.50
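
The contrast between random and deterministic oversampling drawn in the abstract above can be illustrated with a minimal sketch. Standard SMOTE places a synthetic object at a random position on the segment between a minority object and one of its nearest minority neighbors. The deterministic variant shown here, which spaces the synthetic objects evenly along that segment, is only an assumption made for illustration and is not claimed to be the SMOTE-D procedure from the paper; the function names are hypothetical.

```python
import random


def interpolate(x, neighbor, gap):
    """Point at fraction `gap` of the way from x to neighbor (0 <= gap <= 1)."""
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def smote_like_sample(x, neighbor, rng):
    """SMOTE-style synthetic object: random position on the segment."""
    return interpolate(x, neighbor, rng.random())


def evenly_spaced_samples(x, neighbor, k):
    """Illustrative deterministic alternative: k evenly spaced positions."""
    return [interpolate(x, neighbor, j / (k + 1)) for j in range(1, k + 1)]


x, neighbor = [0.0, 0.0], [4.0, 8.0]
random_point = smote_like_sample(x, neighbor, random.Random(0))
fixed_points = evenly_spaced_samples(x, neighbor, 3)
```

Calling `smote_like_sample` with differently seeded generators gives different points, whereas `evenly_spaced_samples` always returns the same list for the same inputs, which is the practical property the abstract emphasizes.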

Cellular Estimation Gaussian Algorithm for Continuous Domain

Authors: Martínez-López, Yoan | Madera, Julio | Rodríguez-González, Ansel Y. | Barigye, Stephen

Article Type: Research Article

Abstract: Optimization algorithms are important in problems of pattern recognition and artificial intelligence, e.g., image recognition, face recognition, data analysis, optical recognition, etc. Estimation of distribution algorithms (EDAs) are a kind of optimization algorithm based on replacing the crossover and mutation operators of Genetic Algorithms with the estimation and subsequent sampling of the probability distribution learned from the selected individuals. However, a weakness of these algorithms is their efficiency in terms of the number of evaluations of the fitness function. In this paper, a Cellular Gaussian Estimation Algorithm (CEGA) for solving continuous optimization problems is proposed. CEGA is derived from evidence-based learning of independence and decentralized schemes of local populations. The experimental results showed that the present proposal reduces the number of evaluations of the fitness function in the search for optima while maintaining its effectiveness in comparison to other state-of-the-art algorithms on the same benchmark of continuous functions.

Keywords: Cellular EDA, learning, probabilistic graph model, Gaussian networks

DOI: 10.3233/JIFS-179042

Citation: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 5, pp. 4957-4967, 2019

Price: EUR 27.50
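
The general EDA loop described in the abstract above (select individuals, estimate a probability distribution from them, sample the next population from it) can be sketched as follows. This is a plain univariate Gaussian EDA on a toy function, not the cellular CEGA algorithm of the paper; the function names, the `sphere` benchmark, and all parameter values are assumptions chosen for illustration.

```python
import random
import statistics


def sphere(x):
    """Toy fitness function to minimize; its optimum is the origin."""
    return sum(v * v for v in x)


def gaussian_eda(fitness, dim=2, pop_size=50, n_select=15,
                 generations=40, seed=0):
    """Univariate Gaussian EDA: no crossover or mutation operators."""
    rng = random.Random(seed)
    means, stds = [1.0] * dim, [2.0] * dim  # arbitrary initial model
    best = None
    for _ in range(generations):
        # Sample a population from the current per-dimension Gaussians.
        population = [[rng.gauss(m, s) for m, s in zip(means, stds)]
                      for _ in range(pop_size)]
        # Truncation selection: keep the n_select fittest individuals.
        population.sort(key=fitness)
        selected = population[:n_select]
        if best is None or fitness(selected[0]) < fitness(best):
            best = selected[0]
        # Re-estimate the model from the selected individuals.
        means = [statistics.fmean(ind[d] for ind in selected)
                 for d in range(dim)]
        stds = [max(statistics.pstdev([ind[d] for ind in selected]), 1e-12)
                for d in range(dim)]
    return best


best = gaussian_eda(sphere)
```

Each generation costs `pop_size` fitness evaluations, so the total here is `pop_size * generations`; this count is the efficiency measure the abstract refers to.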
