Article type: Research Article
Authors: Hou, Linlin [a, b, *] | Zhang, Haixiang [b] | Hou, Qing-Hu [b] | Guo, Alan J.X. [b] | Wu, Ou [b] | Yu, Ting [a] | Zhang, Ji [c, *]
Affiliations: [a] Zhejiang Laboratory, Hangzhou, Zhejiang, China | [b] Center for Applied Mathematics, Tianjin University, Tianjin, China | [c] University of Southern Queensland, Australia
Correspondence: [*] Corresponding authors: Linlin Hou, Zhejiang Laboratory, Hangzhou, Zhejiang, China. E-mail: houlinlin@zhejianglab.com. Ji Zhang, University of Southern Queensland, Australia. E-mail: Ji.Zhang@usq.edu.au.
Abstract: Graph Convolutional Network (GCN) is an important method for learning graph representations of nodes. On large-scale graphs, GCN can suffer from the neighborhood expansion phenomenon, which makes the model complexity high and the training time long. An efficient solution is to adopt graph sampling techniques, such as node sampling and random walk sampling. However, existing sampling methods still aggregate too many neighbor nodes and ignore node feature information. In this paper, we therefore propose a new subgraph sampling method, Similarity-Aware Random Walk (SARW), for GCN on large-scale graphs. We propose a novel similarity index between two adjacent nodes that describes the relationship of nodes with their neighbors. We then design a sampling probability expression between adjacent nodes using node feature information, degree information, neighbor set information, etc. Moreover, we prove the unbiasedness of the SARW-based GCN model for node representations. The simplified version of SARW (SSARW) has a much smaller variance, which indicates the effectiveness of our subgraph sampling method for GCN learning on large-scale graphs. Experiments on six datasets show that our method achieves superior performance over state-of-the-art graph sampling approaches on the large-scale graph node classification task.
Keywords: Similarity-Aware Random Walk, subgraph sampling, Graph Convolutional Network, large-scale graphs, random walk
DOI: 10.3233/IDA-227085
Journal: Intelligent Data Analysis, vol. 27, no. 6, pp. 1615-1636, 2023
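
The general idea summarized in the abstract, a random walk whose next step is biased toward neighbors with similar features and whose visited nodes induce a training subgraph, can be illustrated with a minimal Python sketch. This is not the paper's SARW formula: the abstract does not state the similarity index or the sampling probability expression, so the weight used here (one plus cosine similarity of node features) is a placeholder assumption, and the degree and neighbor-set terms as well as the normalization behind the unbiasedness result are omitted. The names `cosine`, `transition_probs`, and `sample_subgraph`, and the toy graph, are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def transition_probs(node, adj, feats, eps=1e-6):
    """Placeholder transition rule (an assumption, not the SARW expression):
    weight each neighbor by 1 + cosine similarity, then normalize."""
    nbrs = sorted(adj[node])
    weights = [1.0 + cosine(feats[node], feats[n]) + eps for n in nbrs]
    total = sum(weights)
    return nbrs, [w / total for w in weights]


def sample_subgraph(adj, feats, num_roots, walk_length, rng):
    """Run one similarity-biased walk per root and return the induced subgraph."""
    roots = rng.sample(sorted(adj), num_roots)
    visited = set(roots)
    for root in roots:
        cur = root
        for _ in range(walk_length):
            if not adj[cur]:
                break  # isolated node: the walk cannot continue
            nbrs, probs = transition_probs(cur, adj, feats)
            cur = rng.choices(nbrs, weights=probs, k=1)[0]
            visited.add(cur)
    # Keep every original edge whose endpoints were both visited.
    edges = {(u, v) for u in visited for v in adj[u] if v in visited and u < v}
    return visited, edges


# Toy undirected graph: nodes 0 and 1 have similar features, as do 2 and 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
feats = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.9]}
nodes, edges = sample_subgraph(adj, feats, num_roots=2, walk_length=3,
                               rng=random.Random(0))
```

In this toy graph, a walk at node 0 is more likely to step to node 1 than to node 2, because their feature vectors are nearly parallel. A GCN mini-batch would then be trained on the returned `nodes` and `edges` rather than on the full graph.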