Searching for just a few words should be enough to get started. If you need to make more complex queries, use the tips below to guide you.
Article type: Research Article
Authors: Deng, Zhi-Hong | Gao, Ning | Xu, Xiao-Ran
Affiliations: Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China. zhdeng@cis.pku.edu.cn | School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Note: [] Address for correspondence: Room 2318, Science Building 2, Peking University, Beijing, China
Abstract: Mining frequent patterns in database has emerged as an important task in knowledge discovery and data mining. In this paper, we present an efficient algorithm called Mop for fast frequent pattern discovery. Mop utilizes a new kind of data structure called OP_tree (ordered pattern tree) and some particular properties of frequent patterns to facilitate the process of mining frequent patterns. An OP_tree is a special frequent pattern tree, where the children of any node are sorted according to the supports of corresponding items. Efficiency of Mop is achieved with three techniques: (1) it adopts OP_tree to store a large database to avoid repetitive database scans, (2) it finds all frequent 2-patterns in the construction of OP_tree to avoid the costly generation of a large number of candidate 2-patterns, (3) the supports of candidate k-patterns (k>2) can be obtained by traversing a few of specific subtrees of the OP_tree, which greatly reduces the search space and avoid multi-scans of a database. We experimentally compare our algorithm with the Apriori algorithm and the FP-growth algorithm on one real database and one synthetical database. The experimental results show that Mop is about an order of magnitude faster than the Apriori algorithm. Mop also outperforms the FP-growth algorithm, especially when support threshold is very low and databases are quite large.
Keywords: Data Mining, Frequent Patterns, Ordered Pattern Trees
DOI: 10.3233/FI-2011-568
Journal: Fundamenta Informaticae, vol. 111, no. 4, pp. 373-390, 2011
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel: +1 703 830 6300
Fax: +1 703 830 2300
sales@iospress.com
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
IOS Press
Nieuwe Hemweg 6B
1013 BG Amsterdam
The Netherlands
Tel: +31 20 688 3355
Fax: +31 20 687 0091
info@iospress.nl
For editorial issues, permissions, book requests, submissions and proceedings, contact the Amsterdam office info@iospress.nl
Inspirees International (China Office)
Ciyunsi Beili 207(CapitaLand), Bld 1, 7-901
100025, Beijing
China
Free service line: 400 661 8717
Fax: +86 10 8446 7947
china@iospress.cn
For editorial issues, like the status of your submitted paper or proposals, write to editorial@iospress.nl
如果您在出版方面需要帮助或有任何建, 件至: editorial@iospress.nl