Learning Interpretable Representations with Informative Entanglements
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 1970-1976. https://doi.org/10.24963/ijcai.2020/273
Learning interpretable representations in an unsupervised setting is an important yet a challenging task. Existing unsupervised interpretable methods focus on extracting independent salient features from data. However they miss out the fact that the entanglement of salient features may also be informative. Acknowledging these entanglements can improve the interpretability, resulting in extraction of higher quality and a wider variety of salient features. In this paper, we propose a new method to enable Generative Adversarial Networks (GANs) to discover salient features that may be entangled in an informative manner, instead of extracting only disentangled features. Specifically, we propose a regularizer to punish the disagreement between the extracted feature interactions and a given dependency structure while training. We model these interactions using a Bayesian network, estimate the maximum likelihood parameters and calculate a negative likelihood score to measure the disagreement. Upon qualitatively and quantitatively evaluating the proposed method using both synthetic and real-world datasets, we show that our proposed regularizer guides GANs to learn representations with disentanglement scores competing with the state-of-the-art, while extracting a wider variety of salient features.
Machine Learning: Interpretability
Machine Learning: Deep Generative Models
Machine Learning: Deep Learning