Exact Bernoulli Scan Statistics using Binary Decision Diagrams

Exact Bernoulli Scan Statistics using Binary Decision Diagrams

Masakazu Ishihata, Takanori Maehara

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 5737-5743. https://doi.org/10.24963/ijcai.2019/795

In combinatorial statistics, we are interested in a statistical test of combinatorial correlation, i.e., existence a subset from an underlying combinatorial structure such that the observation is large on the subset. The combinatorial scan statistics has been proposed for such a statistical test; however, it is not commonly used in practice because of its high computational cost. In this study, we restrict our attention to the case that the number of data points is moderately small (e.g., 50), the outcome is binary, and the underlying combinatorial structure is represented by a zero-suppressed binary decision diagram (ZDD), and consider the problem of computing the p-value of the combinatorial scan statistics exactly. First, we prove that this problem is a #P-hard problem. Then, we propose a practical algorithm that solves the problem. Here, the algorithm constructs a binary decision diagram (BDD) for a set of realizations of the random variables by a dynamic programming on the ZDD, and computes the p-value by a dynamic programming on the BDD. We conducted experiments to evaluate the performance of the proposed algorithm using real-world datasets.
Keywords:
Uncertainty in AI: Exact Probabilistic Inference
Knowledge Representation and Reasoning: Tractable Languages and Knowledge compilation