An End-to-End Optimal Trade Execution Framework based on Proximal Policy Optimization

Siyu Lin, Peter A. Beling

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Special Track on AI in FinTech. Pages 4548-4554. https://doi.org/10.24963/ijcai.2020/627

In this article, we propose an end-to-end adaptive framework for optimal trade execution based on Proximal Policy Optimization (PPO). We use two methods to account for time dependencies in the market data, based on two different neural network architectures: 1) long short-term memory (LSTM) networks, and 2) fully-connected networks (FCN) that stack the most recent limit order book (LOB) snapshots as model inputs. The proposed framework makes trade execution decisions directly from level-2 LOB information, such as bid/ask prices and volumes, without the manually designed attributes used in previous research. Furthermore, we use a sparse reward function that gives the agent a reward signal at the end of each episode as an indicator of its performance relative to a baseline model, rather than implementation shortfall (IS) or a shaped reward function. The experimental results demonstrate advantages over IS and the shaped reward function in terms of both performance and simplicity. The proposed framework outperforms baseline models commonly used in industry, such as TWAP, VWAP, and AC, as well as several deep reinforcement learning (DRL) models, on most of the 14 US equities in our experiments.
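As a minimal illustration of two ideas described above (stacked LOB inputs for the FCN and the sparse end-of-episode reward), consider the following Python sketch. The function and variable names are hypothetical; the paper does not publish code, and the exact feature layout and reward scaling are assumptions made here for clarity.

import numpy as np

def stack_lob_features(lob_snapshots, k=10):
    # Stack the k most recent level-2 LOB snapshots (bid/ask prices and
    # volumes, each a 1-D array) into one flat input vector for an FCN.
    window = list(lob_snapshots[-k:])
    while len(window) < k:        # pad with the oldest snapshot early on
        window.insert(0, window[0])
    return np.concatenate(window)

def sparse_terminal_reward(agent_avg_price, baseline_avg_price, side="sell"):
    # Reward is issued only at the end of an episode: +1 if the agent's
    # average execution price beats the baseline (e.g., TWAP), -1 if it
    # is worse, 0 on a tie; all intermediate steps receive reward 0.
    edge = agent_avg_price - baseline_avg_price
    if side == "buy":             # for a buy program, lower prices are better
        edge = -edge
    return float(np.sign(edge))

Under this formulation the agent is never rewarded on raw implementation shortfall; it learns only whether it did better or worse than the baseline, which is the property the paper reports as simpler and more effective than IS or a shaped reward.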
Keywords:
Foundation for AI in FinTech: Reinforcement learning for FinTech
AI for trading: AI for algorithmic trading