On Integrating Logical Analysis of Data into Random Forests

David Ing, Said Jabbour, Lakhdar Saïs

Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Main Track. Pages 394-402. https://doi.org/10.24963/ijcai.2025/45

Random Forests (RFs) are among the most popular classifiers in machine learning. An RF is an ensemble learning method that combines multiple Decision Trees (DTs), providing a more robust and accurate model than a single DT. However, one of the main steps of RFs is the random selection of many different features during the construction of the DTs. The resulting forest relies on a wide variety of features, which makes it difficult to extract short and concise explanations. In this paper, we propose integrating Logical Analysis of Data (LAD) into RFs. LAD is a pattern learning framework that combines optimization, Boolean functions, and combinatorial theory. One of its main goals is to generate minimal support sets (MSSes) that discriminate between different groups of data. More precisely, we show how to enhance the classical RF algorithm by randomly choosing MSSes for constructing DTs, rather than randomly choosing feature subsets that potentially contain irrelevant features. Experiments on benchmark datasets reveal that, compared to classical RFs, integrating LAD into RFs using MSSes can maintain similar accuracy, produce forests of similar size, reduce the set of features used, and enable the extraction of significantly shorter explanations.
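To make the notion of a minimal support set concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. It assumes binary features and two classes, and it uses a simple randomized deletion heuristic; the function names `is_support_set` and `random_minimal_support_set` and the toy data are hypothetical. A set of features is treated as a support set if no positive and negative example coincide once projected onto it, and as minimal if no feature can be dropped without losing that property.

```python
import random


def is_support_set(features, positives, negatives):
    """Return True if no positive and negative example agree on all `features`."""
    pos_proj = {tuple(x[f] for f in features) for x in positives}
    return all(tuple(x[f] for f in features) not in pos_proj for x in negatives)


def random_minimal_support_set(positives, negatives, n_features, rng):
    """Assumed heuristic: start from all features and, in random order,
    drop every feature whose removal keeps the two classes separated."""
    kept = list(range(n_features))
    if not is_support_set(kept, positives, negatives):
        raise ValueError("a positive and a negative example are identical")
    order = kept[:]
    rng.shuffle(order)
    for f in order:
        trial = [g for g in kept if g != f]
        if is_support_set(trial, positives, negatives):
            kept = trial
    return sorted(kept)


# Toy binary dataset (hypothetical): features 0 and 3 each separate the
# classes on their own, while features 1 and 2 together do not.
positives = [(1, 0, 1, 0), (1, 1, 0, 0)]
negatives = [(0, 0, 1, 1), (0, 1, 0, 1)]

rng = random.Random(0)
n_trees = 5
# One randomly drawn MSS per tree; each tree would then be trained only on
# the columns listed in its set, instead of on a random feature subset.
tree_feature_sets = [
    random_minimal_support_set(positives, negatives, 4, rng)
    for _ in range(n_trees)
]
```

On this toy dataset every drawn set is either `[0]` or `[3]`, so the irrelevant features 1 and 2 never reach a tree. Because the support set property is preserved when features are added, a single deletion pass is enough to guarantee that the returned set is minimal with respect to inclusion.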
Keywords:
AI Ethics, Trust, Fairness: ETF: Explainability and interpretability
Machine Learning: ML: Explainable/Interpretable machine learning