Bandit Online Learning on Graphs via Adaptive Optimization

Bandit Online Learning on Graphs via Adaptive Optimization

Peng Yang, Peilin Zhao, Xin Gao

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 2991-2997. https://doi.org/10.24963/ijcai.2018/415

Traditional online learning on graphs adapts graph Laplacian into ridge regression, which may not guarantee reasonable accuracy when the data are adversarially generated. To solve this issue, we exploit an adaptive optimization framework for online classification on graphs. The derived model can achieve a min-max regret under an adversarial mechanism of data generation. To take advantage of the informative labels, we propose an adaptive large-margin update rule, which enjoys a lower regret than the algorithms using error-driven update rules. However, this algorithm assumes that the full information label is provided for each node, which is violated in many practical applications where labeling is expensive and the oracle may only tell whether the prediction is correct or not. To address this issue, we propose a bandit online algorithm on graphs. It derives per-instance confidence region of the prediction, from which the model can be learned adaptively to minimize the online regret. Experiments on benchmark graph datasets show that the proposed bandit algorithm outperforms state-of-the-art competitors, even sometimes beats the algorithms using full information label feedback.
Keywords:
Machine Learning: Online Learning
Uncertainty in AI: Uncertainty in AI
Machine Learning Applications: Networks