Article type: Research Article
Authors: Cui, Weia; b | Zhang, Xueruib; * | Shang, Mingshengb; *
Affiliations: [a] College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China | [b] Chongqing Key Laboratory of Big Data and Intelligent Computing, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing, China
Correspondence: [*] Corresponding authors. Mingsheng Shang and Xuerui Zhang, Chongqing Key Laboratory of Big Data and Intelligent Computing, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing, 400714, China. E-mails: msshang@cigit.ac.cn (Mingsheng Shang) and zxr@cigit.ac.cn (Xuerui Zhang)
Abstract: A growing volume of fake news combining text, images and other forms of multimedia is spreading rapidly across social platforms, leading to misinformation and negative impacts. The automatic identification of multimodal fake news has therefore become an important research hotspot in academia and industry. The key to multimedia fake news detection is to accurately extract features from both textual and visual information, and to mine the correlation between them. However, most existing methods merely fuse the features of different modalities without fully extracting intra- and inter-modal connections and complementary information. In this work, we learn physical tampering cues for images in the frequency domain to supplement information from the image spatial domain, and propose a novel multimodal frequency-aware cross-attention network (MFCAN) that fuses the representations of text and image by jointly modelling intra- and inter-modal relationships between textual and visual information within a unified deep framework. In addition, we devise a new cross-modal fusion block based on the cross-attention mechanism that leverages both inter-modal and intra-modal relationships to complement and enhance the feature matching of text and image for fake news detection. We evaluated our approach on two publicly available datasets, and the experimental results show that our proposed model outperforms existing baseline methods.
Keywords: Fake news detection, multimodal, cross-attention, frequency domain
DOI: 10.3233/JIFS-233193
Journal: Journal of Intelligent & Fuzzy Systems, vol. 46, no. 1, pp. 433-455, 2024
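The abstract describes a cross-modal fusion block built on cross-attention, in which text features attend to image features and vice versa, but the paper's exact architecture is not reproduced here. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in NumPy; the function names, dimensions, and identity projections are illustrative assumptions, not MFCAN's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    # queries: (n_q, d) features of one modality (e.g., text tokens).
    # keys_values: (n_kv, d) features of the other modality (e.g., image patches).
    # Real models learn Q/K/V projections; identity projections keep the sketch short.
    scores = queries @ keys_values.T / np.sqrt(d_k)   # (n_q, n_kv) affinity matrix
    weights = softmax(scores, axis=-1)                # each query attends over the other modality
    return weights @ keys_values                      # (n_q, d) modality-enriched features

rng = np.random.default_rng(0)
text = rng.normal(size=(16, 64))    # 16 hypothetical token embeddings
image = rng.normal(size=(49, 64))   # 49 hypothetical patch embeddings

text_enriched = cross_attention(text, image, d_k=64)    # text attends to image
image_enriched = cross_attention(image, text, d_k=64)   # image attends to text
```

Running both directions, as above, is what distinguishes cross-attention fusion from simple feature concatenation: each modality's representation is re-weighted by its affinity with the other before the fused features reach the classifier.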
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
sales@iospress.com
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
info@iospress.nl
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office info@iospress.nl
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
china@iospress.cn
If you need help with publishing or have any suggestions, please email: editorial@iospress.nl