Efficient and Robust High-Dimensional Linear Contextual Bandits

Efficient and Robust High-Dimensional Linear Contextual Bandits

Cheng Chen, Luo Luo, Weinan Zhang, Yong Yu, Yijiang Lian

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Main track. Pages 4259-4265. https://doi.org/10.24963/ijcai.2020/588

The linear contextual bandits is a sequential decision-making problem where an agent decides among sequential actions given their corresponding contexts. Since large-scale data sets become more and more common, we study the linear contextual bandits in high-dimensional situations. Recent works focus on employing matrix sketching methods to accelerating contextual bandits. However, the matrix approximation error will bring additional terms to the regret bound. In this paper we first propose a novel matrix sketching method which is called Spectral Compensation Frequent Directions (SCFD). Then we propose an efficient approach for contextual bandits by adopting SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method only needs O(md) space and O(md) updating time in each round, where d is the dimensionality of the data and m is the sketching size. Theoretical analysis reveals that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
Keywords:
Uncertainty in AI: Sequential Decision Making
Machine Learning: Online Learning