Article type: Research Article
Authors: Li, Xin [a] | Gu, Jiaming [b] | Lu, Xiaoyuan [a]; * | Ning, Yan [b] | Zhang, Liang [c] | Shen, Peiyi [c] | Gu, Chaochen [d]
Affiliations: [a] Southeast University, Nanjing, Jiangsu, China | [b] National Engineering Research Center for Broadband Networks and Applications, Shanghai, China | [c] Xidian University, Xi’an, Shaanxi, China | [d] Shanghai Jiao Tong University, Shanghai, China
Correspondence: [*] Corresponding author: Xiaoyuan Lu, Southeast University, Nanjing, Jiangsu, China. E-mail: xylu@bnc.org.cn.
Abstract: Holistic scene understanding is a challenging problem in computer vision. Most recent research in this field has focused on object detection, semantic segmentation, and relationship detection. Attributes provide meaningful information about an object instance, so the instance can be described in more detail during scene understanding. However, most research on attribute detection has been limited to a few special conditions. For example, several studies focus only on the attributes of a specific object class; because these solutions target limited scenarios, they are hard to generalize to other scenarios. We also find that most work on multi-attribute detection treats each attribute as a binary class and simply applies multiple binary classifiers. These strategies do not consider the relation between each pair of attributes, so they run into trouble on “imperfect” attribute datasets (those labeled with missing and incomplete annotations) and perform poorly on long-tail attribute classes (which have fewer annotations and more missing labels). In this paper, we focus on multi-attribute detection for a variety of object classes and take the relations between attributes into consideration. We propose a GRU-based model that detects a variable-length attribute sequence, together with a customized loss computation method that addresses the “imperfect” attribute dataset problem. Furthermore, we perform ablation studies to demonstrate the effectiveness of each part of our method. Finally, we compare our model with several existing multi-attribute detection methods on the VG (Visual Genome) and CUB200 bird datasets to demonstrate the superior performance of the proposed model.
Keywords: Attribute detection, scene understanding, variable-length attribute detection, VADN
DOI: 10.3233/JCM-226762
Journal: Journal of Computational Methods in Sciences and Engineering, vol. 23, no. 4, pp. 1913-1927, 2023
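
Illustrative sketch (not from the paper): the abstract contrasts per-attribute binary classification with detecting a variable-length attribute sequence, and mentions a loss adapted to incomplete annotations, but it does not give the details of either. The Python below is a minimal sketch under stated assumptions. The stop token `END`, the function names `decode_attributes` and `masked_bce`, and the toy probabilities are all hypothetical. The per-step probabilities are supplied by hand here; in a recurrent model such as a GRU they would be produced step by step, conditioned on the attributes already emitted. The masked loss shows one common way to avoid penalizing unannotated attributes and is not necessarily the authors' customized loss.

```python
import math

END = "<end>"  # hypothetical stop token; the abstract does not name one


def decode_attributes(step_probs, max_len=10):
    """Greedily emit one attribute per step until END is the top choice.

    step_probs: list of dicts mapping token -> probability, one per step.
    Assumes every dict contains END, so a candidate always exists.
    """
    emitted = []
    for probs in step_probs[:max_len]:
        # Skip attributes already emitted so each appears at most once.
        candidates = {t: p for t, p in probs.items() if t not in emitted}
        best = max(candidates, key=candidates.get)
        if best == END:
            break
        emitted.append(best)
    return emitted


def masked_bce(probs, labels):
    """Binary cross-entropy averaged over annotated attributes only.

    labels maps attribute -> 1 (present), 0 (absent) or None (not annotated).
    Unannotated attributes are skipped instead of being counted as negatives.
    """
    terms = []
    for attr, y in labels.items():
        if y is None:
            continue
        p = min(max(probs[attr], 1e-7), 1 - 1e-7)  # clamp to avoid log(0)
        terms.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(terms) / len(terms) if terms else 0.0


# Toy example: two attributes are emitted before END becomes most likely.
steps = [
    {"red": 0.6, "metal": 0.3, END: 0.1},
    {"red": 0.5, "metal": 0.4, END: 0.1},
    {"red": 0.2, "metal": 0.1, END: 0.7},
]
print(decode_attributes(steps))  # ['red', 'metal']

# "shiny" has no annotation, so it does not contribute to the loss.
loss = masked_bce(
    {"red": 0.9, "metal": 0.2, "shiny": 0.5},
    {"red": 1, "metal": 0, "shiny": None},
)
print(round(loss, 4))
```

The decoder returns as many attributes as the step probabilities support, rather than thresholding a fixed set of independent classifiers, which is the property the abstract refers to as variable-length detection.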