Self-Labeling the Job Shop Scheduling Problem

Authors: Andrea Corsini, Angelo Porrello, Simone Calderara, Mauro Dell'Amico

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate this Self-Labeling Improvement Method (SLIM) on the Job Shop Scheduling (JSP), a complex combinatorial problem that is receiving much attention from the neural combinatorial community. We propose a generative model based on the well-known Pointer Network and train it with SLIM. Experiments on popular benchmarks demonstrate the potential of this approach as the resulting models outperform constructive heuristics and state-of-the-art learning proposals for the JSP.
Researcher Affiliation | Academia | University of Modena and Reggio Emilia, Italy
Pseudocode | No | The paper describes the proposed methods in detail but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at: https://github.com/AndreaCorsini1/SelfLabelingJobShop
Open Datasets | Yes | To train our model, we created a dataset of 30,000 instances as in [46] by randomly generating 5,000 instances per shape (n × m) in the set: {10×10, 15×10, 15×15, 20×10, 20×15, 20×20}. While our training strategy does not strictly require a fixed dataset, we prefer using it to enhance reproducibility. (A hedged instance-generation sketch appears after the table.)
Dataset Splits | Yes | During training and validation, we fix the number of sampled solutions β to 256 and save the parameters producing the lower average makespan on a hold-out set comprising 100 random instances per shape included in our dataset. (A hedged model-selection sketch appears after the table.)
Hardware Specification | Yes | All the experiments were performed on an Ubuntu 22.04 machine equipped with an Intel Core i9-11900K and an NVIDIA GeForce RTX 3090 GPU having 24 GB of memory.
Software Dependencies | No | The paper mentions 'Ubuntu 22.04', 'Adam optimizer [19]', 'Google OR-Tools 9.8', and 'Gurobi 9.5', but does not provide specific version numbers for all key software components such as PyTorch or other machine learning libraries used for development.
Experiment Setup | Yes | Our encoder consists of two GAT layers [8], both with 3 attention heads and leaky slope at 0.15. In GAT1, we set the size of each head to 64 and concatenate their outputs; while in GAT2, we increase the heads' size to 128 and average their output to produce e_i ∈ R^{143} (h = 15 + 128). Inside the decoder's memory network, the MHA layer follows [49] but it concatenates the output of 3 heads with 64 neurons each, while W1 ∈ R^{11×192} and W2 ∈ R^{192×128} use 192 and 128 neurons, respectively. Thus, the job states s_j ∈ R^d have size d = 128. Finally, the classifier FNN features a dense layer with 128 neurons activated through the Leaky-ReLU (slope = 0.15) and a final linear layer with 1 neuron. We train this generative model with SLIM (see Sec. 4.2) on our dataset for 20 epochs, utilizing the Adam optimizer [19] with a constant learning rate of 0.0002. In each training step, we accumulate gradients over a batch of size 16, meaning that we update the model parameters θ after processing 16 instances. During training and validation, we fix the number of sampled solutions β to 256 and save the parameters producing the lower average makespan on a hold-out set comprising 100 random instances per shape included in our dataset.
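
The Open Datasets row reports 30,000 training instances generated as in [46], 5,000 per shape. As a rough illustration, the Python sketch below draws random instances in the common Taillard-style convention (integer processing times uniform in [1, 99] and a random machine permutation per job); the function name and these ranges are assumptions, and the exact recipe of [46] may differ.

import numpy as np

def random_jsp_instance(n_jobs, n_machines, rng):
    """One random JSP instance of shape n_jobs x n_machines.

    Assumed Taillard-style convention: each job visits every machine once
    in a random order, with integer processing times drawn from [1, 99].
    """
    proc_times = rng.integers(1, 100, size=(n_jobs, n_machines))
    machine_order = np.stack([rng.permutation(n_machines) for _ in range(n_jobs)])
    return proc_times, machine_order

# 5,000 instances per shape, 30,000 in total, as reported in the paper.
shapes = [(10, 10), (15, 10), (15, 15), (20, 10), (20, 15), (20, 20)]
rng = np.random.default_rng(seed=0)
dataset = {s: [random_jsp_instance(*s, rng) for _ in range(5000)] for s in shapes}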
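
The Experiment Setup row fixes the encoder to two GAT layers with 3 heads each (leaky slope 0.15), head sizes 64 and 128, concatenation after the first layer and averaging after the second, and a final embedding e_i of size 143 = 15 + 128. The sketch below arranges these sizes with PyTorch Geometric's GATConv; the class name, the 15-dimensional input features, the inter-layer activation, and the concatenation of raw features with the GAT2 output are assumptions read off the reported dimensions, not the authors' code.

import torch
import torch.nn.functional as F
from torch import nn
from torch_geometric.nn import GATConv

class GATEncoder(nn.Module):
    """Two-layer GAT encoder matching the reported sizes (assumed layout)."""

    def __init__(self, in_dim: int = 15, slope: float = 0.15):
        super().__init__()
        self.slope = slope
        # GAT1: 3 heads of 64 units, outputs concatenated -> 192-dim features.
        self.gat1 = GATConv(in_dim, 64, heads=3, concat=True, negative_slope=slope)
        # GAT2: 3 heads of 128 units, outputs averaged -> 128-dim features.
        self.gat2 = GATConv(3 * 64, 128, heads=3, concat=False, negative_slope=slope)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # Inter-layer activation is an assumption, not stated in the quoted setup.
        z = F.leaky_relu(self.gat1(x, edge_index), negative_slope=self.slope)
        z = self.gat2(z, edge_index)
        # Concatenate raw operation features (15) with the 128-dim output: e_i in R^{143}.
        return torch.cat([x, z], dim=-1)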
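
Training is described as 20 epochs of Adam with a constant learning rate of 0.0002, gradients accumulated over 16 instances, and β = 256 solutions sampled per instance. The loop below sketches this schedule around a generic self-labeling step (sample β solutions, keep the one with the lowest makespan as pseudo-label, minimize its negative log-likelihood). The model.sample and model.log_prob interfaces are hypothetical stand-ins for the authors' pointer decoder, and this loss is a plausible reading of SLIM rather than its verbatim implementation.

import torch

def self_labeling_loss(model, instance, beta: int = 256) -> torch.Tensor:
    """Sample beta solutions, keep the best (lowest makespan) as the pseudo-label,
    and return its negative log-likelihood under the model (hypothetical API)."""
    with torch.no_grad():
        solutions, makespans = model.sample(instance, n_samples=beta)
        best = int(torch.argmin(makespans))
    return -model.log_prob(instance, solutions[best])

def train(model, train_instances, epochs: int = 20, accum: int = 16):
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)  # constant learning rate
    for _ in range(epochs):
        optimizer.zero_grad()
        for step, instance in enumerate(train_instances, start=1):
            loss = self_labeling_loss(model, instance) / accum
            loss.backward()
            if step % accum == 0:  # update parameters after every 16 instances
                optimizer.step()
                optimizer.zero_grad()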
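
Model selection keeps the parameters achieving the lowest average makespan on a hold-out set of 100 random instances per shape, again sampling β = 256 solutions per instance. A sketch under the same hypothetical model.sample interface as above:

import copy
import torch

def validate(model, holdout_instances, beta: int = 256) -> float:
    """Average best-of-beta makespan over the hold-out instances."""
    model.eval()
    best = []
    with torch.no_grad():
        for instance in holdout_instances:
            _, makespans = model.sample(instance, n_samples=beta)
            best.append(makespans.min().item())
    model.train()
    return sum(best) / len(best)

def keep_best_checkpoint(model, holdout_instances, best_score: float, best_state):
    """Retain the state dict with the lowest hold-out score seen so far."""
    score = validate(model, holdout_instances, beta=256)
    if score < best_score:
        return score, copy.deepcopy(model.state_dict())
    return best_score, best_state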