r-grams: Relational Grams

Niels Landwehr, Luc De Raedt

We introduce relational grams (r-grams), which upgrade n-grams to model relational sequences of atoms. Like n-grams, r-grams are based on smoothed n-th order Markov chains. Smoothed distributions can be obtained both by decreasing the order of the Markov chain and by relational generalization of the r-gram. To avoid sampling object identifiers in sequences, r-grams are generative models at the level of variablized sequences with local object identity constraints. Each such sequence defines an equivalence class of ground sequences whose elements are identical up to local identifier renaming. The proposed technique is evaluated in several domains, including mobile phone communication logs, Unix shell user modeling, and protein fold prediction based on secondary protein structure.
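To make the notion of equivalence up to identifier renaming concrete, the following minimal sketch maps a ground sequence of atoms to a variablized form by replacing each distinct identifier with a variable in order of first occurrence. It is an illustration under stated assumptions, not the method of the paper: the function name `variablize`, the predicates `call` and `sms`, and the choice to treat every argument as an object identifier are hypothetical, and the renaming is applied over the whole sequence, whereas r-grams impose identity constraints only locally.

```python
from typing import Dict, List, Tuple

# An atom is a predicate name plus a tuple of arguments, e.g. ("call", ("alice", "bob")).
Atom = Tuple[str, Tuple[str, ...]]


def variablize(seq: List[Atom]) -> Tuple[Atom, ...]:
    """Rename identifiers to X0, X1, ... in order of first occurrence.

    Assumption: every argument is an object identifier, and renaming is
    global over the sequence (a simplification of local constraints).
    """
    names: Dict[str, str] = {}
    out = []
    for pred, args in seq:
        # setdefault assigns a fresh variable only to unseen identifiers.
        new_args = tuple(names.setdefault(a, f"X{len(names)}") for a in args)
        out.append((pred, new_args))
    return tuple(out)


# Hypothetical phone-log sequences: a and b differ only in identifier names,
# while c has a different identity pattern (the first caller sends the sms).
a = [("call", ("alice", "bob")), ("sms", ("bob", "carol"))]
b = [("call", ("dave", "erin")), ("sms", ("erin", "frank"))]
c = [("call", ("dave", "erin")), ("sms", ("dave", "frank"))]

same_class = variablize(a) == variablize(b)
different_class = variablize(a) != variablize(c)
```

Under these assumptions, two ground sequences fall into the same class exactly when their variablized forms coincide, so a model defined over variablized sequences never has to generate concrete identifiers.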