Contrastive Learning and Reward Smoothing for Deep Portfolio Management

Yun-Hsuan Lien, Yuan-Kui Li, Yu-Shuen Wang

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 3966-3974. https://doi.org/10.24963/ijcai.2023/441

In this study, we used reinforcement learning (RL) models to allocate assets with the goal of earning returns. The models were trained by interacting with a simulated environment built from historical market data, from which they learned trading strategies. However, training deep neural networks on the returns of each period is challenging because financial markets are unpredictable. As a result, policies learned from training data may not remain effective when tested in real-world situations. To address this issue, we incorporated contrastive learning and reward smoothing into our training process. Contrastive learning allows the RL models to recognize patterns in asset states that may indicate future price movements. Reward smoothing, in turn, serves as a regularization technique that discourages the models from seeking immediate but uncertain profits. We compared our method against various traditional financial techniques and other deep RL methods, and found it to be effective in both the U.S. stock market and the cryptocurrency market. Our source code is available at https://github.com/sophialien/FinTech-DPM.
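
The abstract does not say how rewards are smoothed. As a rough illustration of the general idea only, the sketch below assumes a trailing moving average over per-period returns; the function name `smooth_rewards` and the `window` parameter are hypothetical and are not taken from the paper or its repository.

```python
def smooth_rewards(rewards, window=3):
    """Trailing moving average of per-period rewards.

    Illustrative assumption only: the paper's actual smoothing
    scheme may differ from this simple average.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for t in range(len(rewards)):
        # Average the current reward with up to (window - 1) earlier ones.
        start = max(0, t - window + 1)
        recent = rewards[start:t + 1]
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Hypothetical per-period returns with one sharp spike.
raw_rewards = [0.02, -0.05, 0.08, 0.01]
print(smooth_rewards(raw_rewards, window=2))
```

Under this assumption, a single large gain or loss is spread over neighboring periods, so a policy optimized on the smoothed signal has less incentive to chase one-period spikes.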
Keywords:
Machine Learning: ML: Reinforcement learning
Machine Learning: ML: Relational learning
Machine Learning: ML: Representation learning