AI Communications - Volume 36, issue 4 - Journals

Classifying falls using out-of-distribution detection in human activity recognition

Authors: Roy, Debaditya | Komini, Vangjush | Girdzijauskas, Sarunas

Article Type: Research Article

Abstract: As the research community focuses on improving the reliability of deep learning, identifying out-of-distribution (OOD) data has become crucial. Detecting OOD inputs during test/prediction allows the model to account for discriminative features unknown to the model. This capability increases the model’s reliability, since the model provides a class prediction only for incoming data similar to the training data. Although OOD detection is well-established in computer vision, it is relatively unexplored in other areas, like time series-based human activity recognition (HAR). Since uncertainty has been a critical driver for OOD in vision-based models, the same component has proven effective in time-series applications. In this work, we propose an ensemble-based temporal learning framework to address the OOD detection problem in HAR with time-series data. First, we define different types of OOD for HAR that arise from realistic scenarios. Then we apply our ensemble-based temporal learning framework incorporating uncertainty to detect OODs for the defined HAR workloads. This particular formulation also allows a novel approach to fall detection. We train our model on non-fall activities and detect falls as OOD. Our method shows state-of-the-art performance in a fall detection task using much less data. Furthermore, the ensemble framework outperformed the traditional deep-learning method (our baseline) on the OOD detection task across all the other chosen datasets.

Keywords: Out-of-distribution detection, uncertainty estimation, human activity recognition, deep learning, time-series classification

DOI: 10.3233/AIC-220205

Citation: AI Communications, vol. 36, no. 4, pp. 251-267, 2023
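
The idea of training on non-fall activities and flagging falls as OOD can be illustrated with a minimal sketch. The abstract does not state which uncertainty measure the authors use, so the sketch assumes a common choice: the predictive entropy of the ensemble-averaged class probabilities, compared against a threshold. The function names, the example probabilities, and the threshold value are all hypothetical.

```python
import math

def mean_probs(member_probs):
    """Average class probabilities across ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a probability vector; 0 * log(0) is treated as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def is_ood(member_probs, threshold):
    """Flag an input as OOD when the ensemble's predictive entropy exceeds the threshold."""
    return predictive_entropy(mean_probs(member_probs)) > threshold

# Hypothetical softmax outputs of three ensemble members over three non-fall classes.
walking_like = [[0.90, 0.05, 0.05], [0.85, 0.10, 0.05], [0.92, 0.04, 0.04]]
fall_like = [[0.70, 0.20, 0.10], [0.10, 0.80, 0.10], [0.20, 0.10, 0.70]]

THRESHOLD = 0.8  # hypothetical value; in practice tuned on held-out in-distribution data

print(is_ood(walking_like, THRESHOLD))  # members agree, entropy is low
print(is_ood(fall_like, THRESHOLD))     # members disagree, entropy is high
```

When the members agree, the averaged distribution is peaked and the entropy stays low; when they disagree, as they might on an unseen fall, the average flattens and the entropy rises toward its maximum of log(number of classes).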

Fully Automated Neural Network Framework for Pulmonary Nodules Detection and Segmentation

Authors: Xiong, Yixin | Zhou, Yongcheng | Wang, Yujuan | Liu, Quanxing | Deng, Lei

Article Type: Research Article

Abstract: Lung cancer is the leading cause of cancer death worldwide, and most patients are diagnosed at advanced stages because symptoms are lacking in the early stages of the disease, leading to poor prognosis. It is thus of great importance to detect lung cancer in the early stages, which can reduce mortality and improve patient survival significantly. Although there are many computer aided diagnosis (CAD) systems for detecting pulmonary nodules, there are still few CAD systems for both detection and segmentation, and their performance on small nodules is not ideal. Thus, in this paper, we propose a deep cascaded multitask framework called mobilenet split-attention Yolo unet. The mobilenet split-attention Yolo (Msa-Yolo) greatly enhances the features of small nodules and boosts performance on them. Overall, the mean average precision (mAP) of Msa-Yolo, compared with YoloX, increases from 85.10% to 86.64% on the LUNA16 dataset and from 90.13% to 94.15% on the LCS dataset. Besides, we obtain an average of only 8.35 candidates per scan with 96.32% sensitivity on the LUNA16 dataset, which greatly outperforms other existing systems. At the segmentation stage, the mean intersection over union (mIOU) of our CAD system increases from 71.66% to 76.84% on the LCS dataset compared with the baseline. Conclusion: A fast, accurate and robust CAD system for nodule detection, segmentation and classification is proposed in this paper. The experimental results confirm that the proposed system is able to detect and segment small nodules.

Keywords: Pulmonary nodule, deep learning, medical segmentation

DOI: 10.3233/AIC-220318

Citation: AI Communications, vol. 36, no. 4, pp. 269-284, 2023

DW: Detected weight for 3D object detection

Authors: Huang, Zhi

Article Type: Research Article

Abstract: It is a generic paradigm to treat all samples equally in 3D object detection. Although some works focus on discriminating samples in the training process of object detectors, the issue of whether a sample detects its target GT (Ground Truth) during the training process has never been studied. In this work, we first point out that discriminating between the samples that detect their target GT and the samples that do not is beneficial for improving the performance measured in terms of mAP (mean Average Precision). Then we propose a novel approach named DW (Detected Weight). The proposed approach dynamically calculates and assigns different weights to detected and undetected samples, which suppresses the former and promotes the latter. The approach is simple, computationally cheap, and can be integrated with existing weighting approaches. Further, it can be applied to almost all 3D detectors, and even to 2D detectors, because it is independent of the network structure. We evaluate the proposed approach with six state-of-the-art 3D detectors on two datasets. The experimental results show that the proposed approach improves mAP significantly.

Keywords: Object detection, 3D, weight

DOI: 10.3233/AIC-230008

Citation: AI Communications, vol. 36, no. 4, pp. 285-295, 2023

Multi-scale spatio-temporal network for skeleton-based gait recognition

Article Type: Research Article

Abstract: Gait has unique physiological characteristics and supports long-distance recognition, so gait recognition is ideal for areas such as home security and identity detection. Methods using graph convolutional networks usually extract features in the spatial and temporal dimensions by stacking GCNs and TCNs, but different joints are interconnected at different moments, so splitting the spatial and temporal dimensions can cause the loss of gait information. To address this problem, we propose a gait recognition network, Multi-scale Spatio-Temporal Gait (MST-Gait), which can learn multi-scale gait information simultaneously from the spatial and temporal dimensions. We design a multi-scale spatio-temporal groups Transformer (MSTGT) to model the correlation of intra-frame and inter-frame joints simultaneously. A multi-scale segmentation strategy is also designed to capture the periodic and local features of the gait. To fully exploit the temporal information of gait motion, we design a fusion temporal convolution (FTC) to aggregate temporal information at different scales and motion information. Experiments on the popular CASIA-B gait dataset and the OUMVLP-Pose dataset show that our method outperforms most existing skeleton-based methods, verifying the effectiveness of the proposed modules.

Keywords: Gait recognition, graph convolution, self-attention

DOI: 10.3233/AIC-230033

Citation: AI Communications, vol. 36, no. 4, pp. 297-310, 2023

Dynamic finegrained structured pruning sensitive to filter texture distribution

Authors: Li, Ping | Wang, Yuzhe | Wu, Cong | Kang, Xiatao

Article Type: Research Article

Abstract: Pruning of neural networks is undoubtedly a popular approach to compressing large-scale, high-cost network models. However, most existing methods rely heavily on human-regulated pruning criteria, so finding a reasonable pruning strength takes considerable human effort. One of the main reasons is that there are different levels of sensitivity distribution in the network. Our main goal is to discover compression methods that adapt to this distribution to avoid deep architectural damage to the network due to unnecessary pruning. In this paper, we propose a filter texture distribution that affects the training of the network. We also analyze the sensitivity of each of the diverse states of this distribution. To do so, we first use a multidimensional penalty method that can analyze the potential sensitivity based on this texture distribution to obtain a pruning-friendly sparse environment. Then, we set up a lightweight dynamic threshold container in order to prune the sparse network. By providing each filter with a suitable threshold at a low cost, a massive reduction in the number of parameters is achieved without affecting the contribution of certain pruning-sensitive layers to the network as a whole. In the final experiments, our two methods adapted to texture distribution were applied to a ResNet deep neural network (DNN) and VGG-16 on the classical CIFAR-10/100 and ImageNet datasets, with excellent results, to facilitate comparison with cutting-edge pruning methods. Code is available at https://github.com/wangyuzhe27/CDP-and-DTC.

Keywords: Threshold, neural network, penalty, pruning

DOI: 10.3233/AIC-230046

Citation: AI Communications, vol. 36, no. 4, pp. 311-323, 2023

TMTrans: texture mixed transformers for medical image segmentation

Authors: Chen, Lifang | Wang, Tao | Ge, Hongze

Article Type: Research Article

Abstract: Accurate segmentation of skin cancer is crucial for doctors to identify and treat lesions. Researchers are increasingly using auxiliary modules with Transformers to optimize the model’s ability to process global context information and reduce detail loss. Additionally, diseased skin texture differs from normal skin, and pre-processed texture images can reflect the shape and edge information of the diseased area. We propose TMTrans (Texture Mixed Transformers). We design a dual axis attention mechanism (IEDA-Trans) that considers both global context and local information, as well as a multi-scale fusion (MSF) module that associates surface shape information with deep semantics. Additionally, we utilize TE (Texture Enhance) and SK (Skip connection) modules to bridge the semantic gap between encoders and decoders and enhance texture features. Our model was evaluated on multiple skin datasets, including ISIC 2016/2017/2018 and PH2, and outperformed other convolution and Transformer-based models. Furthermore, we conducted a generalization test on the 2018 DSB dataset, which resulted in a nearly 2% improvement in the Dice index, demonstrating the effectiveness of our proposed model.

Keywords: U-Net, texture, transformer, skin lesion, medical image segmentation

DOI: 10.3233/AIC-230089

Citation: AI Communications, vol. 36, no. 4, pp. 325-340, 2023
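
For readers unfamiliar with the Dice index reported above, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), on flat binary masks. It is not taken from the paper; the function name, the example masks, and the empty-mask convention are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for flat binary masks given as lists of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # For 0/1 masks, the elementwise product counts pixels set in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical predicted and ground-truth lesion masks, flattened to one dimension.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]

print(dice_coefficient(pred, target))
```

Here two pixels overlap and each mask has three foreground pixels, so the score is 2 · 2 / (3 + 3) = 2/3.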

Dual cross-domain session-based recommendation with multi-channel integration

Authors: Zhang, Jinjin | Hua, Xiang | Zhao, Peng | Kang, Kai

Article Type: Research Article

Abstract: Session-based recommendation aims at predicting the next behavior when the current interaction sequence is given. Recent advances evaluate the effectiveness of dual cross-domain information for session-based recommendation. However, we find that accurately modeling the session representations is still a challenging problem due to the complexity of preference interactions across domains, and that existing methods model only the common features across domains while ignoring the specific and enhanced features of the dual cross-domain. Without modeling the complete features, the existing methods suffer from poor recommendation accuracy. Therefore, we propose an end-to-end dual cross-domain with multi-channel interaction model (DCMI), which utilizes dual cross-domain session information and multiple preference interaction encoders, for session-based recommendation. In DCMI, we apply a graph neural network to generate the session global preference and local preference. Then, we design a cross-preference interaction module to capture the common, specific, and enhanced features for cross-domain sessions with local preferences and global preferences. Finally, we combine multiple preferences with a bilinear fusion mechanism to characterize and make recommendations. Experimental results on the Amazon dataset demonstrate the superiority of the DCMI model over the state-of-the-art methods.

Keywords: Session-based recommendation, dual cross-domain, cross-preference interaction module, bilinear fusion mechanism

DOI: 10.3233/AIC-230084

Citation: AI Communications, vol. 36, no. 4, pp. 341-359, 2023

Conflagration-YOLO: a lightweight object detection architecture for conflagration

Article Type: Research Article

Abstract: Fire monitoring of fire-prone areas is essential. To meet the requirements of edge deployment and to balance fire recognition accuracy and speed, we design a lightweight fire recognition network, Conflagration-YOLO. Conflagration-YOLO is built from depthwise separable convolutions and pays more attention to fire feature extraction from a three-dimensional (3D) perspective, which improves the network’s feature extraction capability, achieves a balance of accuracy and speed, and reduces model parameters. In addition, a new activation function is used to improve the accuracy of fire recognition while minimizing the inference time of the network. All models are trained and validated on a custom fire dataset, and fire inference is performed on the CPU. The mean Average Precision (mAP) of the proposed model reaches 80.92%, a considerable advantage over Faster R-CNN. Compared with YOLOv3-Tiny, the proposed model decreases the number of parameters by 5.71 M and improves the mAP by 6.67%. Compared with YOLOv4-Tiny, the number of parameters decreases by 3.54 M, mAP increases by 8.47%, and inference time decreases by 62.59 ms. Compared with YOLOv5s, the number of parameters is reduced by 4.45 M, a nearly twofold difference, and the inference time is reduced by 41.87 ms. Compared with YOLOX-Tiny, the number of parameters decreases by 2.5 M, mAP increases by 0.7%, and inference time decreases by 102.49 ms. Compared with YOLOv7, the number of parameters decreases significantly and a balance of accuracy and speed is achieved. Compared with YOLOv7-Tiny, the number of parameters decreases by 3.64 M, mAP increases by 0.5%, and inference time decreases by 15.65 ms. The experiment verifies the superiority and effectiveness of Conflagration-YOLO compared to state-of-the-art (SOTA) network models. Furthermore, our proposed model and its dimensional variants can be applied to computer vision downstream target detection tasks in other scenarios as required.

Keywords: Target detection, Conflagration-YOLO, lightweight, real-time detection

DOI: 10.3233/AIC-230094

Citation: AI Communications, vol. 36, no. 4, pp. 361-376, 2023
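
To see why depthwise separable convolution reduces model parameters, the following sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. The formulas are the standard ones with biases ignored; the layer sizes are hypothetical and are not taken from Conflagration-YOLO.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels

def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)

print(standard, separable, round(standard / separable, 2))
```

For this hypothetical layer the separable form needs 8,768 weights instead of 73,728, roughly an eightfold reduction.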
