Self-Labeling the Job Shop Scheduling Problem

Read original: arXiv:2401.11849 - Published 7/9/2024 by Andrea Corsini, Angelo Porrello, Simone Calderara, Mauro Dell'Amico

🌐

Overview

Proposes a self-supervised training strategy for combinatorial optimization problems
Targets the Job Shop Scheduling (JSP) problem, a complex combinatorial problem
Aims to train generative models without the need for expensive optimal solutions

Plain English Explanation

In many combinatorial optimization problems, like Job Shop Scheduling, the main challenge is obtaining high-quality target solutions to train machine learning models. These optimal solutions are often produced by costly exact solvers, making them difficult to acquire at scale.

The researchers propose a self-supervised learning approach that avoids this requirement. Instead of relying on expensive optimal solutions, the model is trained by generating multiple candidate solutions and using the best one (according to the problem objective) as a "pseudo-label." This allows the model to iteratively improve its solution generation capabilities through self-supervision, without needing access to optimality information.

The team tests this "Self-Labeling" strategy on the Job Shop Scheduling problem, using a generative Pointer Network model. The results show that this self-supervised approach can outperform both constructive heuristics and other state-of-the-art learning-based methods for JSP, demonstrating the potential of this technique for tackling complex combinatorial problems.

Technical Explanation

The researchers propose a self-supervised training strategy to address the challenge of obtaining optimal solutions for combinatorial optimization problems. Inspired by semi-supervised and self-supervised learning, they introduce a "Self-Labeling" approach that allows generative models to be trained without access to expensive target solutions.

The key idea is to have the model generate multiple candidate solutions, and then use the best one (according to the problem objective) as a pseudo-label to train the model. This self-supervision process allows the model to iteratively improve its solution generation capabilities, without relying on optimality information.

The team focuses on the Job Shop Scheduling (JSP) problem, a complex combinatorial optimization challenge that has received significant attention from the Reinforcement Learning community. They use a Pointer Network as the generative model and train it using the proposed Self-Labeling strategy.

Experiments on popular JSP benchmarks demonstrate that the resulting models can outperform both constructive heuristics and current state-of-the-art learning-based approaches. This suggests the effectiveness of the self-supervised training strategy for tackling complex combinatorial optimization problems.

Critical Analysis

The paper presents a promising self-supervised approach for training generative models on combinatorial optimization problems, but there are a few potential limitations and areas for further research:

The effectiveness of the Self-Labeling strategy may be dependent on the specific problem and the quality of the candidate solutions generated by the model. More research is needed to understand the boundaries of this approach and how it scales to larger, more complex combinatorial problems.
The paper focuses on the Job Shop Scheduling problem, but it would be valuable to explore the performance of this self-supervised strategy on a wider range of combinatorial optimization problems to better assess its generalizability.
While the self-supervised approach avoids the need for optimal solutions, it still requires the model to generate a pool of candidate solutions, which may be computationally expensive for some problems. Incorporating more efficient solution generation techniques could further improve the scalability of this approach.

Overall, the proposed self-supervised training strategy shows promise for tackling complex combinatorial optimization challenges, but additional research is needed to fully understand its limitations and identify opportunities for further improvement.

Conclusion

This work presents a novel self-supervised training approach for combinatorial optimization problems that aims to overcome the challenge of obtaining expensive optimal solutions. By training generative models to iteratively improve their solution generation capabilities through self-supervision, the researchers demonstrate the potential of this strategy on the Job Shop Scheduling problem, where the resulting models outperform both constructive heuristics and current state-of-the-art learning-based methods.

This self-supervised approach offers an exciting alternative to traditional supervised learning techniques for combinatorial optimization, potentially opening the door to more scalable and accessible machine learning solutions for a wide range of complex real-world problems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🌐

Self-Labeling the Job Shop Scheduling Problem

Andrea Corsini, Angelo Porrello, Simone Calderara, Mauro Dell'Amico

In this work, we propose a Self-Supervised training strategy specifically designed for combinatorial problems. One of the main obstacles in applying supervised paradigms to such problems is the requirement of expensive target solutions as ground-truth, often produced with costly exact solvers. Inspired by Semi- and Self-Supervised learning, we show that it is possible to easily train generative models by sampling multiple solutions and using the best one according to the problem objective as a pseudo-label. In this way, we iteratively improve the model generation capability by relying only on its self-supervision, completely removing the need for optimality information. We prove the effectiveness of this Self-Labeling strategy on the Job Shop Scheduling (JSP), a complex combinatorial problem that is receiving much attention from the Reinforcement Learning community. We propose a generative model based on the well-known Pointer Network and train it with our strategy. Experiments on popular benchmarks demonstrate the potential of this approach as the resulting models outperform constructive heuristics and current state-of-the-art learning proposals for the JSP.

7/9/2024

Learning to Solve Job Shop Scheduling under Uncertainty

Guillaume Infantes, St'ephanie Roussel, Pierre Pereira, Antoine Jacquet, Emmanuel Benazera

Job-Shop Scheduling Problem (JSSP) is a combinatorial optimization problem where tasks need to be scheduled on machines in order to minimize criteria such as makespan or delay. To address more realistic scenarios, we associate a probability distribution with the duration of each task. Our objective is to generate a robust schedule, i.e. that minimizes the average makespan. This paper introduces a new approach that leverages Deep Reinforcement Learning (DRL) techniques to search for robust solutions, emphasizing JSSPs with uncertain durations. Key contributions of this research include: (1) advancements in DRL applications to JSSPs, enhancing generalization and scalability, (2) a novel method for addressing JSSPs with uncertain durations. The Wheatley approach, which integrates Graph Neural Networks (GNNs) and DRL, is made publicly available for further research and applications.

4/3/2024

LLMs can Schedule

Henrik Abgaryan, Ararat Harutyunyan, Tristan Cazenave

The job shop scheduling problem (JSSP) remains a significant hurdle in optimizing production processes. This challenge involves efficiently allocating jobs to a limited number of machines while minimizing factors like total processing time or job delays. While recent advancements in artificial intelligence have yielded promising solutions, such as reinforcement learning and graph neural networks, this paper explores the potential of Large Language Models (LLMs) for JSSP. We introduce the very first supervised 120k dataset specifically designed to train LLMs for JSSP. Surprisingly, our findings demonstrate that LLM-based scheduling can achieve performance comparable to other neural approaches. Furthermore, we propose a sampling method that enhances the effectiveness of LLMs in tackling JSSP.

8/14/2024

New!Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling

Jesse van Remmerden, Zaharah Bukhsh, Yingqian Zhang

The Job Shop Scheduling Problem (JSSP) is a complex combinatorial optimization problem. There has been growing interest in using online Reinforcement Learning (RL) for JSSP. While online RL can quickly find acceptable solutions, especially for larger problems, it produces lower-quality results than traditional methods like Constraint Programming (CP). A significant downside of online RL is that it cannot learn from existing data, such as solutions generated from CP, requiring them to train from scratch, leading to sample inefficiency and making them unable to learn from more optimal examples. We introduce Offline Reinforcement Learning for Learning to Dispatch (Offline-LD), a novel approach for JSSP that addresses these limitations. Offline-LD adapts two CQL-based Q-learning methods (mQRDQN and discrete mSAC) for maskable action spaces, introduces a new entropy bonus modification for discrete SAC, and exploits reward normalization through preprocessing. Our experiments show that Offline-LD outperforms online RL on both generated and benchmark instances. By introducing noise into the dataset, we achieve similar or better results than those obtained from the expert dataset, indicating that a more diverse training set is preferable because it contains counterfactual information.

9/18/2024