Specifying and Testing k-Safety Properties for Machine-Learning Models

Maria Christakis, Hasan Ferit Eniser, Jörg Hoffmann, Adish Singla, Valentin Wüstholz

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 4748-4757. https://doi.org/10.24963/ijcai.2023/528

Machine-learning models are becoming increasingly prevalent in our lives, for instance, assisting in image-classification or decision-making tasks. Consequently, the reliability of these models is of critical importance and has resulted in the development of numerous approaches for validating and verifying their robustness and fairness. However, beyond such specific properties, it is challenging to specify, let alone check, general functional-correctness expectations of models. In this paper, we take inspiration from specifications used in formal methods, expressing functional-correctness properties by reasoning about k different executions---so-called k-safety properties. For example, for a bank's credit-screening model, the expected property that "if a person is denied a loan and their income decreases, they should still be denied the loan" is a 2-safety property. Here, we show the wide applicability of k-safety properties for machine-learning models and present the first specification language for expressing them. We also operationalize the language in a framework for automatically validating such properties using metamorphic testing. Our experiments show that our framework is effective in identifying property violations and that the detected bugs can be used to train better models.
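The 2-safety property from the abstract can be checked by metamorphic testing: generate an input, derive a follow-up input by a semantics-preserving transformation (here, lowering income), and compare the two model outputs. The following is a minimal, hypothetical sketch of this idea; the toy model, feature layout, and function names are illustrative assumptions, not the paper's framework.

```python
# Hypothetical sketch of metamorphically testing the 2-safety property:
# "if a person is denied a loan and their income decreases, they should
# still be denied the loan". The model and features are toy assumptions.

import random


def toy_model(income: float, debt: float) -> int:
    """Stand-in credit-screening model: approve iff income exceeds 2x debt."""
    return 1 if income > 2 * debt else 0  # 1 = approved, 0 = denied


def check_monotone_denial(model, trials: int = 1000, seed: int = 0):
    """Metamorphic test over 2 executions: sample an input, derive a
    follow-up input with strictly lower income, and record a violation
    whenever a denial flips to an approval."""
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        income = rng.uniform(0, 100_000)
        debt = rng.uniform(0, 100_000)
        if model(income, debt) == 0:  # first execution: denied
            lower_income = income * rng.uniform(0.1, 0.9)
            if model(lower_income, debt) == 1:  # second execution: approved
                violations.append((income, lower_income, debt))
    return violations


# The toy model is monotone in income, so no violations are found;
# a learned model need not be, and any reported triple is a concrete bug.
print(len(check_monotone_denial(toy_model)))  # prints 0
```

In the paper's setting, the detected violating input pairs can additionally be fed back as training data to improve the model.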
Keywords:
Multidisciplinary Topics and Applications: MDA: Software engineering
Agent-based and Multi-agent Systems: MAS: Engineering methods, platforms, languages and tools
AI Ethics, Trust, Fairness: ETF: Safety and robustness