On the Study of Curriculum Learning for Inferring Dispatching Policies on the Job Shop Scheduling

On the Study of Curriculum Learning for Inferring Dispatching Policies on the Job Shop Scheduling

Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal Ochoa de Retana, Martin Takac

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence
Main Track. Pages 5350-5358. https://doi.org/10.24963/ijcai.2023/594

This paper studies the use of Curriculum Learning on Reinforcement Learning (RL) to improve the performance of the dispatching policies learned on the Job-shop Scheduling Problem (JSP). Current works in the literature present a large optimality gap when learning end-to-end solutions on this problem. In this regard, we identify the difficulty for RL to learn directly on large instances as part of the issue and use Curriculum Learning (CL) to mitigate this effect. Particularly, CL sequences the learning process in a curriculum of increasing complexity tasks, which allows learning on large instances that otherwise would be impossible to learn from scratch. In this paper, we present a size-agnostic model that enables us to demonstrate that current curriculum strategies have a major impact on the quality of the solution inferred. In addition, we introduce a novel Reinforced Adaptive Staircase Curriculum Learning (RASCL) strategy, which adjusts the difficulty level during the learning process by revisiting the worst-performing instances. Conducted experiments on Taillard’s and Demirkol’s datasets show that the presented approach significantly improves the current stateof-the-art models on the JSP. It reduces the average optimality gap from 19.35% to 10.46% on Taillard’s instances and from 38.43% to 18.85% on Demirkol’s instances.
Keywords:
Planning and Scheduling: PS: Learning in planning and scheduling
Planning and Scheduling: PS: Scheduling