ABSTRACT
We introduce a new algorithm for mining sequential patterns. Our algorithm is especially efficient when the sequential patterns in the database are very long. We introduce a novel depth-first search strategy that integrates a depth-first traversal of the search space with effective pruning mechanisms.Our implementation of the search strategy combines a vertical bitmap representation of the database with efficient support counting. A salient feature of our algorithm is that it incrementally outputs new frequent itemsets in an online fashion.In a thorough experimental evaluation of our algorithm on standard benchmark data from the literature, our algorithm outperforms previous work up to an order of magnitude.
- R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. In Proceedings of the Twentieth International Conference on Very Large Databases, pages 487--499, Santiago, Chile, 1994.]] Google ScholarDigital Library
- R. Agrawal and R. Srikant. Mining Sequential Patterns. In ICDE 1995, Taipei, Taiwan, March 1995.]] Google ScholarDigital Library
- R. J. Bayardo. Efficiently mining long patterns from databases. In SIGMOD 1998, pages 85--93, 1998.]] Google ScholarDigital Library
- C. Bettini, X. S. Wang, and S. Jajodia. Mining temporal relationships with multiple granularities in time sequences. Data Engineering Bulletin, 21(1):32--38, 1998.]]Google Scholar
- D. Burdick, M. Calimlim, and J. Gehrke. Mafia: A maximal frequent itemset algorithm for transactional databases. In ICDE 2001, Heidelberg, Germany, 2001.]] Google ScholarDigital Library
- M. Garofalakis, R. Rastogi, and K. Shim. SPIRIT: Sequential pattern mining with regular expression constraints. In VLDB 1999, pages 223--234, San Francisco, Sept. 1999. Morgan Kaufmann.]] Google Scholar
- J. Han, G. Dong, and Y. Yin. Efficient mining of partial periodic patterns in time series database. In ICDE 1999, pages 106--115, Sydney, Australia, Mar. 1999.]] Google ScholarDigital Library
- H. Mannila, H. Toivonen, and A. I. Verkamo. Discovering frequent episodes in sequences. In KDD 1995, pages 210--215, Montreal, Quebec, Canada, 1995.]] Google ScholarDigital Library
- J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu. PrefixSpan mining sequential patterns efficiently by prefix projected pattern growth. In ICDE 2001, pages 215--226, Heidelberg, Germany, Apr. 2001.]] Google ScholarDigital Library
- R. Srikant and R. Agrawal. Mining Sequential Patterns: Generalizations and Performance Improvements. In EDBT 1996, Avignon, France, March 1996.]] Google ScholarDigital Library
- R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and performance improvements. In P. M. G. Apers, M. Bouzeghoub, and G. Gardarin, editors, EDBT 1996, pages 3--17, 25--29 Mar. 1996.]] Google ScholarDigital Library
- M. J. Zaki. Spade: An efficient algorithm for mining frequent sequences. Machine Learning, 42(1/2):31--60, 2001.]] Google ScholarDigital Library
Index Terms
- Sequential PAttern mining using a bitmap representation
Recommendations
From sequential pattern mining to structured pattern mining: A pattern-growth approach
AbstractSequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. ...
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a difficult problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Most of the ...
Closed Multidimensional Sequential Pattern Mining
ITNG '06: Proceedings of the Third International Conference on Information Technology: New GenerationsWe propose a new method, called closed multidimensional sequential pattern mining, for mining multidimensional sequential patterns. The new method is an integration of closed sequential pattern mining and closed itemset pattern mining. Based on this ...
Comments