A comparative analysis of euphemistic sentences in news using feature weight scheme and intelligent techniques

Seethappan, K.; Premalatha, K.

doi:10.3233/JIFS-211295

Article type: Research Article

Authors: Seethappan, K. [a, *] | Premalatha, K. [b]

Affiliations: [a] Department of Computer Science and Engineering, University College of Engineering, Ramanathapuram, Tamilnadu, India | [b] Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, Tamilnadu, India

Correspondence: [*] Corresponding author. K. Seethappan, Assistant Professor, Department of Computer Science and Engineering, University College of Engineering, Ramanathapuram, Tamilnadu, India-623 513. E-mail: seethappan@gmail.com.

Abstract: Although the detection of various kinds of figurative language has been widely studied, no existing work addresses the automatic classification of euphemisms. We present a system for automatically classifying euphemistic phrases in a document. A large dataset of 100,000 sentences was collected from different resources for identifying euphemistic and non-euphemistic utterances. We focus on three approaches to improving euphemism classification: (1) a combination of lexical n-gram features, (2) three feature-weighting schemes, and (3) deep learning classification algorithms. Four machine learning algorithms (J48, Random Forest, Multinomial Naïve Bayes, and SVM) and three deep learning algorithms (Multilayer Perceptron, Convolutional Neural Network, and Long Short-Term Memory) are investigated with various combinations of features and feature-weighting schemes to classify the sentences. In our experiments, the Convolutional Neural Network (CNN) achieves a precision of 95.43%, recall of 95.06%, F-score of 95.25%, accuracy of 95.26%, and Kappa of 0.905 using a combination of unigram and bigram features with the TF-IDF feature-weighting scheme. These results show that the CNN with combined unigram and bigram features and TF-IDF weighting outperforms the other six classification algorithms in detecting euphemisms in our dataset.
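The best-performing feature representation named in the abstract, unigram and bigram features weighted by TF-IDF, can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant, tokenizer, or preprocessing the authors used, so the whitespace tokenization, the `tf * log(N / df)` formula, the function names `ngram_features` and `tfidf_vectors`, and the three toy sentences below are all assumptions made for illustration only, not the authors' implementation or data.

```python
import math
from collections import Counter


def ngram_features(sentence, n_values=(1, 2)):
    """Return lowercase word n-grams (unigrams and bigrams by default)."""
    tokens = sentence.lower().split()  # assumption: plain whitespace tokenization
    feats = []
    for n in n_values:
        feats.extend(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return feats


def tfidf_vectors(sentences):
    """Weight each n-gram by tf * log(N / df); returns one dict per sentence."""
    docs = [Counter(ngram_features(s)) for s in sentences]
    n_docs = len(docs)

    # Document frequency: number of sentences containing each n-gram.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in doc.items()
        })
    return vectors


# Toy sentences (hypothetical, not taken from the paper's dataset).
sentences = [
    "he passed away last night",
    "he died last night",
    "the company let go of staff",
]
vectors = tfidf_vectors(sentences)
```

In this sketch the bigram "passed away" occurs in only one sentence and so receives a higher weight than "last night", which occurs in two. The resulting sparse vectors are the kind of input that a classifier such as those compared in the abstract would consume.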

Keywords: Euphemism, TF-IDF, n-gram, Support Vector Machine, CNN

Journal: Journal of Intelligent & Fuzzy Systems, vol. 42, no. 3, pp. 1937-1948, 2022

Published: 02 February 2022
