Variable-length sequence model for attribute detection in the image

Li, Xin; Gu, Jiaming; Lu, Xiaoyuan; Ning, Yan; Zhang, Liang; Shen, Peiyi; Gu, Chaochen

doi:10.3233/JCM-226762

Article type: Research Article

Authors: Li, Xin^a | Gu, Jiaming^b | Lu, Xiaoyuan^{a; *} | Ning, Yan^b | Zhang, Liang^c | Shen, Peiyi^c | Gu, Chaochen^d

Affiliations: [a] Southeast University, Nanjing, Jiangsu, China | [b] National Engineering Research Center for Broadband Networks and Applications, Shanghai, China | [c] Xidian University, Xi’an, Shaanxi, China | [d] Shanghai Jiao Tong University, Shanghai, China

Correspondence: [*] Corresponding author: Xiaoyuan Lu, Southeast University, Nanjing, Jiangsu, China. E-mail: xylu@bnc.org.cn.

Abstract: Holistic scene understanding is a challenging problem in computer vision. Most recent research in this field has focused on object detection, semantic segmentation, and relationship detection. Attributes provide meaningful information about an object instance, so the instance can be described in more detail during scene understanding. However, most research on attribute detection has been limited to special conditions. For example, several studies focus only on the attributes of a specific object class; because these solutions target limited scenarios, they are hard to generalize to other scenarios. We also find that most work on multi-attribute detection treats each attribute as a binary class and simply applies multiple binary classifiers. These strategies do not consider the relations between pairs of attributes, so they struggle on “imperfect” attribute datasets (those labeled with missing and incomplete annotations) and perform poorly on long-tail attribute classes (those that rank lower in annotation count and have more missing labels). In this paper, we focus on multi-attribute detection for a variety of object classes and take the relations between attributes into consideration. We propose a GRU-based model that detects a variable-length attribute sequence, together with a customized loss computation method that addresses the “imperfect” attribute dataset problem. Furthermore, we perform ablation studies to demonstrate the effectiveness of each part of our method. Finally, we compare our model with several existing multi-attribute detection methods on the VG (Visual Genome) and CUB200 bird datasets to demonstrate the superior performance of the proposed model.
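The abstract does not spell out the customized loss, so the following is only a generic illustration of why missing annotations hurt the multi-binary-classifier baseline, not the authors' method. It is a minimal Python sketch that assumes each attribute label is 1 (present), 0 (absent), or None (not annotated). The function names `masked_bce` and `naive_bce`, the attribute names, and the probabilities are all hypothetical.

```python
import math


def masked_bce(probs, labels):
    """Mean binary cross-entropy over annotated attributes only.

    probs:  predicted probability per attribute, each in (0, 1)
    labels: 1 (present), 0 (absent), or None (not annotated)
    """
    eps = 1e-12  # guard against log(0)
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y is None:  # missing annotation: contributes nothing to the loss
            continue
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


def naive_bce(probs, labels):
    """Baseline assumption: every missing annotation is treated as a negative."""
    return masked_bce(probs, [0 if y is None else y for y in labels])


# Hypothetical object with attributes ("red", "wooden", "large").
# "wooden" is plausible for the object but was never annotated.
probs = [0.9, 0.8, 0.1]
labels = [1, None, 0]

print("masked loss:", masked_bce(probs, labels))
print("naive loss: ", naive_bce(probs, labels))
```

Under these assumptions the naive variant penalizes the confident "wooden" prediction as if it were wrong, while the masked variant leaves it unpenalized. Masking is one common way to handle incomplete labels; the loss used in the paper may differ.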

Keywords: Attribute detection, scene understanding, variable-length attribute detection, VADN

DOI: 10.3233/JCM-226762

Journal: Journal of Computational Methods in Sciences and Engineering, vol. 23, no. 4, pp. 1913-1927, 2023

Published: 18 August 2023
