Mining Complex Patterns across Sequences with Gap Requirements

Xingquan Zhu, Xindong Wu

The recurring appearance of sequential patterns, when confined by predefined gap requirements, often implies strong temporal correlations or trends among pattern elements. In this paper, we study the problem of mining a set of gap-constrained sequential patterns across multiple sequences. Given a set of sequences S1, S2, ..., SK constituting a single hyper-sequence S, we aim to find recurring patterns in S. Such a pattern, say P, may cross multiple sequences, with all of its matching characters in S bounded by the user-specified gap constraints. Because of the combinatorial explosion of candidates, traditional Apriori-based algorithms are computationally infeasible. Our research proposes a new mechanism to ensure pattern growing and pruning. When the pruning technique is combined with our Gap Constrained Search (GCS) and map-based support prediction approaches, our method runs about 40 times faster than its peers.
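To make the notion of a gap constraint concrete, the following is a minimal Python sketch that counts the occurrences of a pattern in a single sequence when the number of characters skipped between consecutive matched characters must lie in a given range. It is an illustration only, not the GCS algorithm or the support definition of this paper: the function name `count_gap_matches`, the parameters `min_gap` and `max_gap`, and the restriction to one sequence rather than a hyper-sequence are all assumptions made for the example.

```python
def count_gap_matches(sequence, pattern, min_gap, max_gap):
    """Count occurrences of `pattern` in `sequence` such that the number of
    characters skipped between consecutive matched characters lies in
    [min_gap, max_gap]. Hypothetical helper, single sequence only."""
    n, m = len(sequence), len(pattern)
    if n == 0 or m == 0:
        return 0

    # ways[i] = number of matches of the current pattern prefix that end at i
    ways = [1 if sequence[i] == pattern[0] else 0 for i in range(n)]

    for j in range(1, m):
        nxt = [0] * n
        for i in range(n):
            if sequence[i] != pattern[j]:
                continue
            # The previous matched position p must satisfy
            # min_gap <= i - p - 1 <= max_gap.
            lo = max(0, i - 1 - max_gap)
            hi = i - 1 - min_gap
            if hi >= lo:
                nxt[i] = sum(ways[lo:hi + 1])
        ways = nxt

    return sum(ways)


if __name__ == "__main__":
    # "AT" in "ATAT": position pairs (0,1), (0,3), (2,3) have gaps 0, 2, 0.
    print(count_gap_matches("ATAT", "AT", 0, 2))
    print(count_gap_matches("ATAT", "AT", 1, 2))
```

Tightening the gap range from [0, 2] to [1, 2] removes the two adjacent matches and leaves only the one that skips two characters, which shows how the constraint limits the occurrences a miner has to consider.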