Article type: Research Article
Authors: Ng, Willie [*] | Dash, Manoranjan
Affiliations: School of Computer Engineering, Nanyang Technological University, Singapore
Correspondence: [*] Corresponding author. E-mail: WillieNg@pmail.ntu.edu.sg.
Abstract: We investigate the problem of finding frequent patterns in a continuous stream of transactions. In the literature, two prominent approaches are often used: (a) perform approximate counting (e.g., the lossy counting algorithm (LCA) of Manku and Motwani, VLDB 2002) using a lower support threshold than the one given by the user, or (b) maintain a running sample (e.g., reservoir sampling (Algo-Z) of Vitter, TOMS 1985) and generate frequent itemsets from the sample on demand. Although both are known to be practically useful, to the best of our knowledge there has been no comparison between them. In addition, we propose a distance-based sampling algorithm (DSS). An empirical comparison of the algorithms is performed using synthetic and benchmark datasets. Results show that DSS is consistently more accurate than LCA and Algo-Z, whereas LCA performs better than Algo-Z. An outcome of this study is a new algorithm, CLCA. In LCA, the proper quantification of the error parameter, ε, is non-trivial. CLCA is an attempt to exploit this fact in proposing a new customized LCA algorithm. Interestingly, CLCA outperforms all other algorithms (including DSS) in mining the frequent itemsets of the user's choice.
Keywords: Approximate counting, data streams, sampling
DOI: 10.3233/IDA-2010-0450
Journal: Intelligent Data Analysis, vol. 14, no. 6, pp. 749-771, 2010
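
The two baseline approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the stream carries single items rather than transactions or itemsets, the function names are hypothetical, and the sampler is the simple replace-at-random reservoir scheme rather than Vitter's Algorithm Z, which draws from the same distribution but skips over records instead of testing each one. DSS and CLCA are not shown because the abstract does not describe how they work.

```python
import math
import random


def lossy_count(stream, epsilon):
    """Lossy counting over single items (simplified; assumes items, not itemsets).

    Returns (entries, n), where entries maps item -> [count, delta] and
    n is the number of items seen. Each stored count undercounts the true
    frequency by at most epsilon * n.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    entries = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            entries[item][0] += 1
        else:
            # delta bounds how many occurrences may have been missed
            entries[item] = [1, bucket - 1]
        if n % width == 0:
            # bucket boundary: prune entries that cannot be frequent
            entries = {k: v for k, v in entries.items() if v[0] + v[1] > bucket}
    return entries, n


def frequent_items(entries, n, support, epsilon):
    """Report items whose stored count reaches the lowered threshold (support - epsilon) * n."""
    return {k for k, (count, _) in entries.items() if count >= (support - epsilon) * n}


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item  # replace a random slot
    return sample


if __name__ == "__main__":
    data = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
    entries, n = lossy_count(data, epsilon=0.1)
    print(frequent_items(entries, n, support=0.2, epsilon=0.1))
    print(reservoir_sample(data, 5, random.Random(0)))
```

The sketch makes the trade-off in the abstract concrete: lossy counting answers against a lowered threshold (support minus ε), so the choice of ε directly controls both memory use and how many false positives are reported, while the reservoir keeps a fixed-size sample and defers all counting until a query arrives.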