Article type: Research Article
Authors: Cheng, Zhao [a, b] | Chen, Guanlin [a, b, *] | Weng, Wenyong [b] | Lu, Qi [c] | Yang, Wujian [b]
Affiliations: [a] School of Computing Science and Artificial Intelligence, Changzhou University, Changzhou, Jiangsu, China | [b] School of Computer and Computing Science, Zhejiang University City College, Hangzhou, Zhejiang, China | [c] China National Air Separation Engineering Co., Ltd, Hangzhou, Zhejiang, China
Correspondence: [*] Corresponding author: Guanlin Chen, School of Computer and Computing Science, Zhejiang University City College, Hangzhou, Zhejiang 310015, China. E-mail: chenguanlin@zucc.edu.cn.
Abstract: Recently, AWD-LSTM (ASGD Weight-Dropped LSTM) has achieved good results in language modeling, and many AWD-LSTM-based models have obtained state-of-the-art perplexities. However, large-scale neural language models have been shown to be prone to overfitting. In the original AWD-LSTM paper, the authors adopted a retraining stage, called fine-tuning, to obtain better results. In this paper, we present a simple yet effective parameter rollback mechanism for neural language models. We introduce parameter rollback averaged stochastic gradient descent (PR-ASGD), in which the “step” parameter of ASGD is decreased with a certain probability. Using this strategy, we achieve better word-level perplexities on Penn Treebank: 56.26 with the AWD-LSTM model and 53.57 with the AWD-LSTM-MoS (AWD-LSTM Mixture of Softmaxes) model.
Keywords: Optimizer, language model, machine learning
DOI: 10.3233/JCM-226215
Journal: Journal of Computational Methods in Sciences and Engineering, vol. 22, no. 6, pp. 2375-2385, 2022
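
The abstract describes PR-ASGD only at a high level: the “step” counter of ASGD is decreased with a certain probability. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the usual ASGD running average with weight `mu = 1 / max(1, step - t0)`; the function name `pr_asgd_average` and the parameters `rollback_prob`, `rollback_size`, and `seed` are hypothetical and do not come from the paper.

```python
import random


def pr_asgd_average(iterates, rollback_prob=0.1, rollback_size=1, t0=0, seed=0):
    """Average a sequence of parameter vectors with a probabilistic step rollback.

    Hypothetical sketch: the rollback rule (subtract `rollback_size` from the
    step counter with probability `rollback_prob`) is an assumption made for
    illustration, not the rule specified in the paper.
    """
    rng = random.Random(seed)
    step = 0
    avg = None
    for x in iterates:
        step += 1
        # Assumed rollback: occasionally decrease the step counter,
        # never letting it drop below 1.
        if rng.random() < rollback_prob:
            step = max(1, step - rollback_size)
        # Standard ASGD-style averaging weight derived from the step counter.
        mu = 1.0 / max(1, step - t0)
        if avg is None:
            avg = list(x)
        else:
            # Running average: move `avg` toward the new iterate by a factor mu.
            avg = [a + mu * (xi - a) for a, xi in zip(avg, x)]
    return avg, step


if __name__ == "__main__":
    params = [[1.0], [2.0], [3.0]]
    print(pr_asgd_average(params, rollback_prob=0.0))  # plain running mean
    print(pr_asgd_average(params, rollback_prob=1.0))  # step is held at 1
```

In this sketch, a smaller step counter yields a larger `mu`, so the average weights recent iterates more heavily. With `rollback_prob=0.0` the function reduces to an ordinary running mean of the iterates.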