Article type: Research Article
Authors: Kim, Donggyu | Yun, Unil*
Affiliations: Department of Computer Engineering, Sejong University, Seoul, Korea
Correspondence: [*] Corresponding author: Unil Yun, Department of Computer Engineering, Sejong University, Seoul, Korea. E-mail: yunei@sejong.ac.kr
Abstract: Various association rule mining techniques, such as frequent itemset mining, sequence itemset mining, and high utility itemset mining, have been studied to reveal valuable knowledge hidden in large databases. Among these techniques, high utility itemset mining has been actively studied because, by considering the utility of each item in a given database, it can find more meaningful itemsets than other approaches. In recent years, mining high utility itemsets over data streams has emerged as an interesting topic because many users want to obtain valuable information from stream data, which are continually generated at rapid rates. However, in these environments, most previous high utility itemset mining methods cannot work efficiently in terms of either runtime or memory usage. In addition, since they conduct their mining processes without considering the arrival times of transactions, it is hard for these methods to satisfy users who want only up-to-date, relevant information from data streams. In this paper, we propose a new tree-based algorithm that mines recent high utility itemsets over data streams. On the basis of the time decaying model, our algorithm diminishes the utilities of transactions according to their arrival times in order to assign larger weights to recent data than to older data. Moreover, the algorithm regularly updates the utility information in its tree data structure and prunes the nodes whose utility values are less than a user-specified minimum value. The algorithm can thereby keep its memory usage within a reasonable bound by avoiding unnecessary memory use. Experimental results demonstrate that our algorithm can mine recent high utility itemsets from varying stream data while consuming fewer computational resources than existing algorithms.
Keywords: Data mining, high utility itemset mining, stream data mining, time decaying model
DOI: 10.3233/IDA-160861
Journal: Intelligent Data Analysis, vol. 20, no. 5, pp. 1157-1180, 2016
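The time decaying and pruning ideas described in the abstract can be illustrated with a small sketch. This is not the paper's tree-based algorithm: it is a simplified, hypothetical flat table that tracks single items only, assumes a constant decay factor applied once per arriving transaction, and prunes by a plain threshold on the decayed utility. The names `DecayedUtilityTable`, `decay`, and `min_utility` are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


class DecayedUtilityTable:
    """Hypothetical flat-table stand-in for a tree of utility information.

    Assumption: every arriving transaction ages all stored utilities by a
    constant factor `decay` in (0, 1], so recent data weigh more than old data.
    """

    def __init__(self, decay, min_utility):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.min_utility = min_utility
        self.utilities = defaultdict(float)

    def add_transaction(self, transaction):
        """Record one transaction, given as a mapping item -> utility."""
        # Age every stored utility by one step before adding the new arrival.
        for item in self.utilities:
            self.utilities[item] *= self.decay
        # The newest transaction enters with full (undecayed) utility.
        for item, utility in transaction.items():
            self.utilities[item] += utility

    def prune(self):
        """Remove items whose decayed utility is below the minimum value."""
        stale = [i for i, u in self.utilities.items() if u < self.min_utility]
        for item in stale:
            del self.utilities[item]
        return stale


table = DecayedUtilityTable(decay=0.5, min_utility=2.0)
table.add_transaction({"a": 4.0})
table.add_transaction({"b": 8.0})
table.add_transaction({"b": 2.0})
removed = table.prune()
print(removed, dict(table.utilities))
```

In this run, item `a` arrives early and is never seen again, so its utility decays from 4.0 to 1.0 and it is pruned, while `b` keeps a decayed utility of 6.0. A real high utility itemset miner would also have to handle itemsets rather than single items, which this sketch deliberately leaves out.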