Article type: Research Article
Authors: Li, Kun[a] | Tian, Shengwei[a],[*] | Yu, Long[b] | Zhou, Tiejun[c] | Wang, Bo[a] | Wang, Fun[a]
Affiliations: [a] School of Software, University of Xinjiang, Xinjiang, China | [b] Network and Information Center, University of Xinjiang, Xinjiang, China | [c] Internet Information Security Centre, Xinjiang, China
Correspondence: [*] Corresponding author. Shengwei Tian, School of Software, University of Xinjiang, Xinjiang, China. E-mail: tianshengwei@163.com.
Abstract: In recent years, multimodal sentiment analysis (MSA) has been devoted to developing effective fusion mechanisms and has made advances. However, several challenges have not been addressed adequately: models make insufficient use of important information (inter-modal relevance and independence information), which introduces additional noise, and the traditional ternary symmetric architecture cannot handle the uneven distribution of task-related information among modalities. Thus, we propose the Mutual Information Maximization and Feature Space Separation and Bi-Bimodal Modality Fusion (MFSBF) framework, which effectively alleviates these problems. To address the underutilization of important information among modalities, we design a mutual information maximization module and a feature space separation module. The mutual information module maximizes the mutual information between two modalities to retain more relevance (modality-invariant) information, while the feature separation module separates fusion features to prevent the loss of independence (modality-specific) information during the fusion process. As different modalities contribute differently to the model, a bi-bimodal fusion architecture is used, which fuses two bimodal pairs. The architecture focuses more on the modality that contains more task-related information and alleviates the uneven distribution of useful information among modalities. On two publicly available datasets (CMU-MOSI and CMU-MOSEI), our model achieved results better than or comparable to those of previous models, which demonstrates the efficacy of our method.
Keywords: Multimodal sentiment analysis, mutual information, feature separation, modality fusion
DOI: 10.3233/JIFS-222189
Journal: Journal of Intelligent & Fuzzy Systems, vol. 45, no. 4, pp. 5783-5793, 2023
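The abstract describes three mechanisms: maximizing mutual information between modality pairs, a separation penalty that keeps fused features from losing modality-specific content, and a bi-bimodal fusion of two modality pairs. The PyTorch sketch below is one plausible minimal reading of that design, not the authors' implementation; the contrastive (InfoNCE-style) MI bound, the cosine-based separation penalty, the text-centered pairing, and all module names and dimensions are illustrative assumptions.

```python
# Minimal illustrative sketch of the MFSBF ideas (assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveMI(nn.Module):
    """InfoNCE-style lower bound on mutual information between two modalities
    (assumed here as the MI-maximization objective; the paper may use another bound)."""
    def __init__(self, dim_x, dim_y):
        super().__init__()
        self.proj = nn.Linear(dim_x, dim_y)  # bilinear score: score(x, y) = (Wx)^T y

    def forward(self, x, y):
        # scores[i, j] compares sample i of modality x with sample j of modality y;
        # the diagonal holds the aligned (positive) pairs within the batch.
        scores = self.proj(x) @ y.t()
        labels = torch.arange(x.size(0), device=x.device)
        # Minimizing this cross-entropy maximizes the InfoNCE MI bound.
        return F.cross_entropy(scores, labels)

class BiBimodalFusion(nn.Module):
    """Fuse two bimodal pairs, (text, audio) and (text, vision), keeping the
    information-rich text modality in both pairs (an assumed pairing)."""
    def __init__(self, d_t, d_a, d_v, d_h):
        super().__init__()
        self.ta = nn.Linear(d_t + d_a, d_h)
        self.tv = nn.Linear(d_t + d_v, d_h)
        self.head = nn.Linear(2 * d_h, 1)  # sentiment regression head
        self.mi_ta = ContrastiveMI(d_t, d_a)
        self.mi_tv = ContrastiveMI(d_t, d_v)

    def forward(self, t, a, v):
        h_ta = torch.relu(self.ta(torch.cat([t, a], dim=-1)))
        h_tv = torch.relu(self.tv(torch.cat([t, v], dim=-1)))
        pred = self.head(torch.cat([h_ta, h_tv], dim=-1)).squeeze(-1)
        # Feature-space separation: push the two bimodal fusion features apart so
        # each keeps modality-specific information (squared cosine similarity is
        # one of several possible separation penalties).
        sep = F.cosine_similarity(h_ta, h_tv, dim=-1).pow(2).mean()
        mi = self.mi_ta(t, a) + self.mi_tv(t, v)
        return pred, mi, sep

# Toy usage: random features stand in for per-modality encoder outputs.
t, a, v = torch.randn(8, 128), torch.randn(8, 32), torch.randn(8, 32)
model = BiBimodalFusion(d_t=128, d_a=32, d_v=32, d_h=64)
pred, mi_loss, sep_loss = model(t, a, v)
total = F.l1_loss(pred, torch.zeros(8)) + 0.1 * mi_loss + 0.1 * sep_loss
total.backward()
```

A training step would combine the task loss with the two auxiliary terms, e.g. total = task + λ_mi · mi + λ_sep · sep, with the weights tuned on validation data; the 0.1 coefficients above are placeholders.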