Date
Tue, 23 Apr 2024
Time
14:00 - 14:30
Location
L3
Speaker
Zangir Iklassov
Organisation
Mohamed bin Zayed University of Artificial Intelligence

Our research explores the application of reinforcement learning (RL) to complex combinatorial optimization problems, specifically the Job-shop Scheduling Problem (JSP) and the Stochastic Vehicle Routing Problem with Time Windows (SVRP). For JSP, we use Curriculum Learning (CL) to improve the performance of dispatching policies. This approach addresses the significant optimality gap of existing end-to-end solutions by structuring training as a sequence of increasingly complex tasks, which makes larger, more intricate instances easier to handle. Our study introduces a size-agnostic model and a novel strategy, Reinforced Adaptive Staircase Curriculum Learning (RASCL), which dynamically adjusts difficulty levels during training and focuses on the most challenging instances. Experimental results on the Taillard and Demirkol datasets show that our approach reduces the average optimality gap to 10.46% and 18.85%, respectively.
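For readers unfamiliar with the metric, the optimality gap quoted above is conventionally the percentage by which a schedule's makespan exceeds the best-known makespan for the same instance, averaged over a benchmark set. The sketch below assumes that standard definition; the function names and the sample numbers are hypothetical and are not taken from the study.

```python
def optimality_gap(makespan: float, best_known: float) -> float:
    """Percentage by which a makespan exceeds the best-known makespan."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (makespan - best_known) / best_known


def average_gap(results: list[tuple[float, float]]) -> float:
    """Mean gap over (makespan, best_known) pairs, one pair per instance."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(optimality_gap(m, b) for m, b in results) / len(results)


# Hypothetical numbers, for illustration only (not results from the talk).
example = [(1100.0, 1000.0), (2300.0, 2000.0)]
print(f"{average_gap(example):.2f}%")  # prints "12.50%"
```

Under this definition, a lower value means the learned dispatching policy produces schedules closer to the best-known solutions.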

For SVRP, we propose an end-to-end framework that uses an attention-based neural network trained through RL to minimize routing costs under uncertain travel costs and demands, while respecting customer-specific delivery time windows. This model outperforms the state-of-the-art Ant-Colony Optimization algorithm, reducing travel costs by 1.73%, and remains robust across diverse environmental settings, making it a valuable baseline for future research. Both studies mark advancements in the application of machine learning techniques to operational research.

Last updated on 19 Apr 2024 15:24.