Mining discriminative patches for script identification in natural scene images

Lu, Liqiong; Wu, Dong; Tang, Ziwei; Yi, Yaohua; Huang, Faliang

doi:10.3233/JIFS-200260

Article type: Research Article

Authors: Lu, Liqiong^{a; b} | Wu, Dong^a | Tang, Ziwei^b | Yi, Yaohua^{b; *} | Huang, Faliang^{c; *}

Affiliations: [a] Department of Information Engineering, Lingnan Normal University, Zhanjiang, P.R. China | [b] School of Printing and Packaging, Wuhan University, Wuhan, P.R. China | [c] School of Computer and Information Engineering, Nanning Normal University, Nanning, P.R. China

Correspondence: [*] Corresponding authors: Yaohua Yi, School of Printing and Packaging, Wuhan University, Wuhan, P.R. China. E-mail: whudcil@whu.edu.cn and Faliang Huang, School of Computer and Information Engineering, Nanning Normal University, Nanning, P.R. China. E-mail: faliang.huang@gmail.com

Abstract: This paper focuses on script identification in natural scene images. Traditional CNNs (Convolutional Neural Networks) cannot fully solve this problem for two reasons. First, scene images have arbitrary aspect ratios, which is difficult for traditional CNNs that take a fixed-size image as input. Second, some scripts with minor differences are easily confused because they share a subset of characters with the same shapes. We propose a novel approach combining a Score CNN, an Attention CNN and patches. The Attention CNN determines whether a patch is discriminative and calculates the contribution weight of each discriminative patch to the script identification of the whole image. The Score CNN takes a discriminative patch as input and predicts a score for each script type. First, patches of the same size are extracted from the scene images. Second, these patches are used as inputs to the Score CNN and the Attention CNN to train two patch-level classifiers. Finally, the results that the two classifiers produce for multiple discriminative patches extracted from the same image are fused to obtain the script type of that image. Using same-size patches as CNN inputs avoids the problems caused by the arbitrary aspect ratios of scene images, and the trained classifiers can mine discriminative patches to accurately identify easily confused scripts. Experimental results on four public datasets show that the approach performs well.
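
The patch extraction and fusion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the fusion formula, so the attention-weighted sum, the 0.5 threshold, the fallback when no patch passes the threshold, and the function names `extract_patch_offsets` and `fuse_patch_scores` are all assumptions made for illustration. The per-patch scores and weights stand in for the outputs of the Score CNN and Attention CNN.

```python
from typing import List, Sequence


def extract_patch_offsets(width: int, patch_size: int, stride: int) -> List[int]:
    """Left x-offsets of fixed-size patches slid across an image of any width.

    Assumes the image has already been resized so its height equals patch_size.
    """
    if width <= patch_size:
        return [0]  # a single patch covers the whole image
    offsets = list(range(0, width - patch_size + 1, stride))
    if offsets[-1] != width - patch_size:
        offsets.append(width - patch_size)  # extra patch so the right edge is covered
    return offsets


def fuse_patch_scores(
    scores: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off for "discriminative", not from the paper
) -> int:
    """Return the index of the script with the highest attention-weighted score.

    scores[i][k] is the (hypothetical) Score CNN output of patch i for script k;
    weights[i] is the (hypothetical) Attention CNN weight of patch i.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and of equal length")
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    if not kept:
        kept = list(range(len(weights)))  # assumed fallback: use every patch
    num_scripts = len(scores[0])
    fused = [sum(weights[i] * scores[i][k] for i in kept) for k in range(num_scripts)]
    return max(range(num_scripts), key=lambda k: fused[k])


if __name__ == "__main__":
    print(extract_patch_offsets(width=100, patch_size=32, stride=32))
    patch_scores = [[0.6, 0.4], [0.2, 0.8], [0.9, 0.1]]
    patch_weights = [0.1, 0.9, 0.6]
    print(fuse_patch_scores(patch_scores, patch_weights))
```

Because every patch has the same size regardless of the image width, the classifiers never see a distorted or padded input, and the weighting lets a few distinctive patches outvote patches that contain only characters shared between similar scripts.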

Keywords: Script identification, score CNN, attention CNN, discriminative patches, scene images

Journal: Journal of Intelligent & Fuzzy Systems, vol. 40, no. 1, pp. 551-563, 2021

Published: 04 January 2021
