Closing the BIG-LID: An Effective Local Intrinsic Dimensionality Defense for Nonlinear Regression Poisoning
Sandamal Weerasinghe, Tamas Abraham, Tansu Alpcan, Sarah M. Erfani, Christopher Leckie, Benjamin I. P. Rubinstein
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 3176-3184.
https://doi.org/10.24963/ijcai.2021/437
Nonlinear regression, although widely used in engineering, financial and security applications for automated decision making, is known to be vulnerable to training data poisoning. Targeted poisoning attacks may cause learning algorithms to fit decision functions with poor predictive performance. This paper presents a new analysis of the local intrinsic dimensionality (LID) of nonlinear regression under such poisoning attacks within a Stackelberg game, leading to a practical defense. After adapting to nonlinear settings a gradient-based attack on linear regression that significantly impairs prediction capabilities, we consider a multi-step unsupervised black-box defense. The first step identifies the samples that have the greatest influence on the learner's validation error; we then use the theory of local intrinsic dimensionality, which quantifies the degree to which a data sample is an outlier, to iteratively identify poisoned samples via a generative probabilistic model and suppress their influence on the prediction function. Empirical validation demonstrates superior performance compared to a range of recent defenses.
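The abstract does not specify how LID is estimated per sample. A common choice in the LID literature is the k-nearest-neighbour maximum-likelihood estimator of Amsaleg et al.; the sketch below illustrates that estimator as a per-sample outlierness score, assuming scikit-learn for neighbour search. It is a minimal illustration under those assumptions, not the authors' implementation, and the function names and synthetic data are hypothetical.

```python
# Minimal sketch of a k-NN maximum-likelihood LID estimator (Amsaleg et al.),
# a standard way to score how "outlier-like" each sample is.
# The estimator choice, scikit-learn usage, and names here are assumptions,
# not the paper's implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lid_mle(X, k=20):
    """Estimate LID for each row of X from its k nearest neighbours.

    LID_hat(x) = -( (1/k) * sum_i log(r_i / r_k) )^{-1},
    where r_1 <= ... <= r_k are distances from x to its k nearest neighbours.
    """
    # k + 1 neighbours because the nearest neighbour of a training point is itself.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dists, _ = nn.kneighbors(X)
    r = dists[:, 1:]                      # drop the zero self-distance
    r_k = r[:, -1:]                       # distance to the k-th neighbour
    log_ratios = np.log(r / r_k + 1e-12)  # epsilon guards against log(0)
    return -1.0 / log_ratios.mean(axis=1)

if __name__ == "__main__":
    # Hypothetical example: points injected away from the clean data's local
    # structure tend to receive noticeably different LID estimates.
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(500, 2))
    poison = rng.uniform(-4.0, 4.0, size=(25, 2))
    scores = lid_mle(np.vstack([clean, poison]), k=20)
    print(scores[:5], scores[-5:])
```

Such per-sample LID scores could then feed the kind of iterative filtering the abstract describes, with flagged samples down-weighted or removed before refitting the regression model.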
Keywords:
Machine Learning: Adversarial Machine Learning
Multidisciplinary Topics and Applications: Security and Privacy
Data Mining: Anomaly/Outlier Detection