A Two-level Reinforcement Learning Algorithm for Ambiguous Mean-variance Portfolio Selection Problem
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Special Track on AI in FinTech. Pages 4527-4533. https://doi.org/10.24963/ijcai.2020/624
Traditional models of mean-variance portfolio selection often assume full knowledge of the statistics of asset returns. This is, however, not always the case in real financial markets. This paper deals with an ambiguous mean-variance portfolio selection problem in which the returns of risky assets follow a mixture model whose component proportions are unknown to the investor but constant over time. Accounting for how future observations will update the estimated proportions is essential for finding an optimal policy with an active learning feature, but it makes the problem intractable for classical methods. Using reinforcement learning, we derive an investment policy with a learning feature in a two-level framework. At the lower level, a time-decomposed approach (dynamic programming) is adopted to solve a family of scenario subproblems, each of which specifies the sequence of component distributions over the multiple time periods. At the upper level, a scenario-decomposed approach (the progressive hedging algorithm) is applied to iteratively aggregate the scenario solutions from the lower level based on the current knowledge of the proportions. This two-level solution framework is repeated in a rolling-horizon manner. We carry out experimental studies to illustrate the execution of our policy scheme.
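The scenario structure described above can be illustrated with a minimal sketch. It assumes that a scenario is a sequence of mixture-component indices, one per period, that components are drawn independently across periods so a scenario's probability is the product of the current proportion estimates, and that the aggregation step is the probability-weighted average of the per-scenario first-period decisions, as in a standard progressive hedging iteration. The function names `scenario_probabilities` and `aggregate_first_period` are hypothetical and are not taken from the paper, which may define these steps differently.

```python
from itertools import product
from math import prod


def scenario_probabilities(proportions, horizon):
    """Map each component sequence of length `horizon` to its probability.

    Assumption: components are drawn independently in each period, so the
    probability of a sequence is the product of its components' proportions.
    """
    k = len(proportions)
    return {
        seq: prod(proportions[i] for i in seq)
        for seq in product(range(k), repeat=horizon)
    }


def aggregate_first_period(decisions, probabilities):
    """Probability-weighted average of per-scenario first-period decisions.

    `decisions` maps each scenario to a list of asset allocations that a
    lower-level solver (not shown here) produced for that scenario.
    """
    n_assets = len(next(iter(decisions.values())))
    return [
        sum(probabilities[s] * decisions[s][j] for s in decisions)
        for j in range(n_assets)
    ]
```

With two components and a two-period horizon this enumerates four scenarios. In a rolling-horizon scheme, the proportion estimates would be refreshed after each new observation and the enumeration and aggregation repeated.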
AI for trading: AI for portfolio analytics