Combinatorial Optimization with Policy Adaptation using Latent Space Search

Authors: Felix Chalumeau, Shikha Surana, Clément Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Tom Barrett

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate COMPASS across three canonical problems (Travelling Salesman, Capacitated Vehicle Routing, and Job-Shop Scheduling) and demonstrate that our search strategy (i) outperforms state-of-the-art approaches in 9 out of 11 standard benchmarking tasks and (ii) generalizes better, surpassing all other approaches on a set of 18 procedurally transformed instance distributions."
Researcher Affiliation | Industry | Felix Chalumeau (InstaDeep, f.chalumeau@instadeep.com); Shikha Surana (InstaDeep, s.surana@instadeep.com); Clément Bonnet (InstaDeep, c.bonnet@instadeep.com); Nathan Grinsztajn (InstaDeep, n.grinsztajn@instadeep.com); Arnu Pretorius (InstaDeep, a.pretorius@instadeep.com); Alexandre Laterre (InstaDeep, a.laterre@instadeep.com); Thomas D. Barrett (InstaDeep, t.barrett@instadeep.com)
Pseudocode | Yes | "Full details of the algorithmic procedure can be found in Appendix F. Notably, our work is the first to create a specialized and diverse set of policies represented by a continuous latent space by only training the best-performing vector for each problem instance." (a hedged sketch of this idea follows the table)
Open Source Code | Yes | "We release the code used to train our method and to run all baselines. We also make our checkpoints available for all three problems, along with the datasets necessary to reproduce the results." (Footnote: "Code, checkpoints and evaluation sets are available at https://github.com/instadeepai/compass")
Open Datasets | Yes | "For TSP and CVRP, we use datasets of 10,000 instances drawn from the training distribution, with the positions of 100 cities/customers uniformly sampled within the unit square, and three datasets not seen during training, each containing 1,000 problem instances but with larger sizes: 125, 150 and 200, also generated from a uniform distribution over the unit square. We use the exact same datasets as in the literature." (instance sampling is sketched below the table)
Dataset Splits | Yes | "We train COMPASS until convergence on the same training distribution as that used to train the initial checkpoint. For TSP and CVRP these are problem instances with 100 locations uniformly sampled within a unit square. For JSSP, we use the same training distribution used in EAS, which consists of instances with 10 jobs and 10 machines, and a maximum possible duration of 98."
Hardware Specification | Yes | "Our code is optimized for TPU v3-8, which is the hardware used for our experiments." (a device check is sketched below the table)
Software Dependencies | Yes | "For the three problems, we used the JAX (Bradbury et al., 2018) implementations from Jumanji (Bonnet et al., 2023) to leverage hardware accelerators (e.g. TPU)." (environment usage is sketched below the table)
Experiment Setup | Yes | "A single set of hyperparameters is used across all problems, with full training details provided in Appendix G."
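
The Pseudocode row above describes COMPASS's core mechanism: a continuous latent space encodes a diverse set of specialized policies, and only the best-performing latent vector for each problem instance is trained. Below is a minimal sketch of that idea in JAX, the framework used by the paper; LATENT_DIM, NUM_CANDIDATES, rollout, and search_step are all illustrative assumptions, and the authoritative procedure is given in Appendix F of the paper.

```python
import jax
import jax.numpy as jnp

LATENT_DIM = 16       # latent space dimensionality (illustrative assumption)
NUM_CANDIDATES = 64   # latent vectors evaluated per step (illustrative assumption)

def search_step(key, policy_params, instance, rollout):
    """Sample candidate latent vectors and keep the best-performing one.

    rollout(policy_params, instance, z) -> scalar episode return for latent z.
    """
    latents = jax.random.uniform(
        key, (NUM_CANDIDATES, LATENT_DIM), minval=-1.0, maxval=1.0
    )
    # Evaluate the conditioned policy for every candidate latent vector.
    returns = jax.vmap(lambda z: rollout(policy_params, instance, z))(latents)
    best = jnp.argmax(returns)
    # Only the best-performing vector for this instance would receive a
    # training signal, per the quoted description.
    return latents[best], returns[best]
```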
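The Open Datasets row describes instances whose coordinates are drawn uniformly from the unit square. Here is a hedged sketch of how such instances could be sampled in JAX, using the quoted sizes (10,000 training-sized instances of 100 cities; 1,000 instances per larger size); make_instances is an illustrative name, and the paper's actual evaluation sets ship with the released code.

```python
import jax

def make_instances(key, num_instances, num_cities):
    """Sample (num_instances, num_cities, 2) coordinates uniformly in [0, 1]^2."""
    return jax.random.uniform(key, (num_instances, num_cities, 2))

key = jax.random.PRNGKey(0)
train_key, gen_key = jax.random.split(key)
train_like = make_instances(train_key, 10_000, 100)  # training-distribution size
larger = make_instances(gen_key, 1_000, 150)         # one of the larger sizes (125/150/200)
```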
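As a quick sanity check for the Hardware Specification row: on a TPU v3-8 host, JAX should enumerate eight TPU cores. This uses only standard JAX device queries.

```python
import jax

# On a TPU v3-8 host, JAX exposes the chip as eight TPU cores.
print(jax.devices())        # e.g. [TpuDevice(id=0), ..., TpuDevice(id=7)]
print(jax.device_count())   # expected: 8
```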
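For the Software Dependencies row, a minimal sketch of driving a Jumanji environment functionally with JAX is shown below; the registered ID "TSP-v1" is an assumption that may vary across Jumanji versions.

```python
import jax
import jax.numpy as jnp
import jumanji

# Registered environment ID is an assumption; check your Jumanji version.
env = jumanji.make("TSP-v1")
key = jax.random.PRNGKey(0)
state, timestep = env.reset(key)            # purely functional reset
action = jnp.int32(0)                       # visit city 0 first (illustrative)
state, timestep = env.step(state, action)   # purely functional step
```

Because both reset and step are pure functions of explicit state, they can be wrapped in jax.jit or jax.vmap, which is what lets this stack exploit TPU accelerators as the quoted text describes.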