Editorial

Dear Colleague:

Welcome to the 2020 special issue of the Intelligent Data Analysis (IDA) Journal.

This special issue of the IDA journal consists of ten of the best papers, covering evolving theoretical and applied research in the field of intelligent systems and pattern recognition. These papers underwent a strict peer-review process by the Conference Technical Program Committee and the reviewers of the CIARP 2019 conference, which was held in Havana, Cuba, last October.

The first paper, by Bruzon et al., is about multilevel term analysis for adaptive document filtering. The authors argue that the logical and intentional organization of information in documents is not usually exploited by filtering methods when constructing a user profile. They propose using term relations that consider different context levels to enhance document filtering. Their experiments assess the impact of the proposed representation on the filtering task.

The second paper, by Bugueño and Mendoza, is about learning to combine classifier outputs with the transformer for text classification. The authors propose a scheme that combines the outputs of different classifiers, coding them in the encoder of a transformer. These encodings are used to train a new text classifier, which allows the representation learning task to be driven without over-fitting the encoding to a particular class. Their experiments demonstrate that combining both methods, representation learning and data augmentation, improves the performance of the trained classifiers.

The third paper, by Pérez-Guadarramas et al., is also about text processing. The authors present an unsupervised method for keyphrase extraction based on lexico-syntactic patterns for extracting information from texts and on fuzzy topic modeling. They evaluate their approach on the Inspec and 500N-KPCrowd datasets. They also perform a statistical analysis to substantiate the best approach and compare it with other reported systems, with promising results.

The fourth paper, by Mena et al., is about collective annotation patterns in learning from crowds. The authors argue that the lack of annotated data is one of the major barriers facing machine learning applications. They present two models to address this problem. Both are based on the hypothesis that collective annotation patterns can be learned by introducing confusion matrices that involve groups of data point annotations or groups of annotators. Their experimental results show that, compared with other methods for learning from crowds, both methods have advantages in scenarios with a large number of annotators and a small number of annotations per annotator.

In the fifth paper of this special issue, Camacho et al. present a function optimization algorithm that combines the Kalman filter with the Adam algorithm, a first-order gradient-based optimizer of stochastic functions. The idea is to filter each parameter of the objective function using a 1-D Kalman filter, which makes it possible to switch from matrix and vector calculations to scalar operations. The proposed approach is well suited to problems with large datasets and/or many parameters, non-stationary objectives, and noisy and/or sparse gradients.
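
To illustrate why a 1-D filter needs only scalar arithmetic, here is a minimal sketch of a generic scalar Kalman filter smoothing a noisy gradient. It is not the algorithm of Camacho et al.: the random-walk state model, the noise variances, the toy objective (theta - 3)^2, and the use of plain gradient descent instead of Adam are all assumptions made for illustration.

```python
import random


class ScalarKalman:
    """Generic 1-D Kalman filter with a random-walk state model (illustrative)."""

    def __init__(self, process_var=1e-3, measurement_var=1.0):
        self.q = process_var      # assumed process noise variance
        self.r = measurement_var  # assumed measurement noise variance
        self.x = 0.0              # current state estimate
        self.p = 1.0              # variance of the estimate

    def update(self, z):
        # Predict: the state is assumed unchanged, uncertainty grows by q.
        p_pred = self.p + self.q
        # Correct: the gain is a scalar ratio, so no matrix inversion is needed.
        k = p_pred / (p_pred + self.r)
        self.x += k * (z - self.x)
        self.p = (1.0 - k) * p_pred
        return self.x


def noisy_grad(theta, rng):
    # Gradient of the toy objective (theta - 3)^2 with additive Gaussian noise.
    return 2.0 * (theta - 3.0) + rng.gauss(0.0, 1.0)


def minimize(steps=2000, lr=0.05, seed=0):
    # Plain gradient descent driven by the filtered gradient (not Adam).
    rng = random.Random(seed)
    kf = ScalarKalman()
    theta = 0.0
    for _ in range(steps):
        theta -= lr * kf.update(noisy_grad(theta, rng))
    return theta


if __name__ == "__main__":
    print(minimize())  # should end up near the minimizer theta = 3
```

With one such filter per parameter, each update costs a handful of scalar operations regardless of how many parameters the model has.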

The sixth paper, by González-Méndez et al., is about evaluating pattern restrictions for associative classifiers. The authors present an experimental comparison of the impact of different restrictions on classification accuracy and find that their conclusions could be unintentionally biased by the restrictions they used. Their investigation opens some interesting lines of research, mainly in the creation of new restrictions and of new pattern types obtained by joining different restrictions.

The seventh paper, by Prado-Romero et al., introduces a time-sensitive model to predict topic popularity in news providers. The authors argue that the daily growth in the volume of news intensifies competition for users’ attention, so predicting which topics will become trendy has many applications in domains such as marketing and politics. They propose a model that represents topic popularity behavior across time and predicts whether a topic will become trendy in the future. Their approach is tested on a real data set from Yahoo News, and the experiments confirm the validity of the proposed model.

In the eighth paper of this special issue, Mena et al. present an interpretable and effective hashing method based on Bernoulli variational auto-encoders. The authors argue that, due to the rapid increase in the amount of data generated in many fields of science and engineering, information retrieval methods tailored to large-scale datasets have become increasingly important. They explain that semantic hashing is an emerging technique for this purpose, built on the idea of representing complex data objects, such as images and text, as compact binary codes. They present an approach based on Bernoulli latent variables in both the training and the prediction stages of learning. The authors conclude that minding the gap between these two stages in the design of the auto-encoder can translate into more accurate retrieval results.

In the ninth paper, which is about the generation of multi-label prototypes, Bello et al. argue that data reduction techniques play a key role in instance-based classification by reducing the amount of data to be processed. Prototype generation, which aims to obtain a reduced training set, translates into a significant reduction in both the spatial and the temporal burden of the algorithms. This issue is particularly relevant in multi-label classification, a generalization of multiclass classification that allows objects to belong to several classes. Their simulations show that these methods significantly reduce the number of examples to a set of prototypes without significantly affecting the classifiers’ performance.

Finally, in the last paper, Serpell et al. address model uncertainty in probabilistic forecasting using Monte Carlo dropout. The authors argue that deep learning models have been developed to address probabilistic forecasting tasks by assuming an implicit stochastic process that relates past observed values to uncertain future values. These models appear capable of capturing the inherent uncertainty of the underlying process, but they ignore the model uncertainty that comes from not having infinite data. The paper proposes addressing the model uncertainty problem using Monte Carlo dropout, a variational approach that assigns distributions to the weights of a neural network instead of using fixed values. The proposal is validated for prediction interval estimation on seven energy time series, with a popular probabilistic model called Mean Variance Estimation (MVE) as the deep model adapted using the technique.
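
The general idea of Monte Carlo dropout can be sketched as follows: dropout stays active at prediction time, the network is evaluated many times with different random masks, and the spread of the outputs is read as model uncertainty. This is a generic illustration, not the MVE-based model of Serpell et al.; the tiny network, its fixed weights `W1` and `W2`, the dropout rate, and the sample count are hypothetical.

```python
import random
import statistics

# Hypothetical fixed weights: 1 input, 4 hidden ReLU units, 1 output.
W1 = [0.5, -0.3, 0.8, 0.1]
W2 = [1.0, 0.7, -0.4, 0.9]


def forward(x, rng, drop_p=0.5, dropout_active=True):
    """One forward pass; with dropout active, each hidden unit is kept at random."""
    out = 0.0
    for w1, w2 in zip(W1, W2):
        h = max(0.0, w1 * x)  # ReLU activation
        if dropout_active:
            # Inverted dropout: rescale kept units so the expected value is unchanged.
            h = h / (1.0 - drop_p) if rng.random() >= drop_p else 0.0
        out += w2 * h
    return out


def mc_dropout_predict(x, n_samples=200, drop_p=0.5, seed=0):
    """Mean and standard deviation over repeated stochastic forward passes."""
    rng = random.Random(seed)
    samples = [forward(x, rng, drop_p) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, std = mc_dropout_predict(2.0)
    print(mean, std)  # the standard deviation is the model-uncertainty estimate
```

A prediction interval could then be formed from the sampled outputs, for example from their empirical quantiles.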

We hope this special issue provides you with some interesting ideas about recent advances in intelligent pattern recognition systems associated with the field of Intelligent Data Analysis. We are grateful to all the authors for preparing extended versions of their papers, and to the referees and the CIARP 2019 Conference Organizing Committee for their efforts.

With best regards,

Dr. Jose E. Medina Pagola and Dr. A. Famili

Guest Editors