Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Memory-Enhanced Neural Solvers for Routing Problems
Authors: Felix Chalumeau, Refiloe Shabe, Noah De Nicola, Arnu Pretorius, Tom Barrett, Nathan Grinsztajn
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate its effectiveness on the Traveling Salesman and Capacitated Vehicle Routing problems, demonstrating its superiority over tree-search and policy-gradient fine-tuning; and showing that it can be zero-shot combined with diversity-based solvers. We successfully train all RL auto-regressive solvers on large instances, and verify MEMENTO's scalability and data-efficiency: pushing the state-of-the-art on 11 out of 12 evaluated tasks. |
| Researcher Affiliation | Collaboration | Felix Chalumeau¹, Refiloe Shabe¹, Noah De Nicola², Arnu Pretorius¹, Thomas D. Barrett¹, Nathan Grinsztajn¹. ¹InstaDeep, ²University of Cape Town |
| Pseudocode | Yes | The details of the MEMENTO training procedure are presented in Algorithm 1 and can be understood as follows. |
| Open Source Code | Yes | Code availability: We provide access to the code² utilized for training our method and executing all baseline models. We release our checkpoints for all problem types and scales, accompanied by the necessary datasets to replicate our findings. We implement our method and experiments in JAX (Bradbury et al., 2018), along with test sets and checkpoints. Footnote 2: Code, checkpoints, and evaluation sets available at https://github.com/instadeepai/memento |
| Open Datasets | Yes | We use datasets of 10,000 instances with 100 cities/customer nodes drawn from the training distribution, and three generalization datasets of 1,000 instances of sizes 125, 150, and 200, all from benchmark sets frequently used in the literature (Kool et al., 2019; Kwon et al., 2020; Hottung et al., 2022; Grinsztajn et al., 2023; Chalumeau et al., 2023b). For TSP, we use the dataset from Fu et al. (2021). For CVRP, we use the dataset from Luo et al. (2023). |
| Dataset Splits | Yes | These instances feature the positions of 100 cities/customers uniformly sampled within the unit square. The benchmark also includes three datasets of distributions not encountered during training, each comprising 1000 problem instances with larger sizes: 125, 150, and 200, generated from a uniform distribution across the unit square. We compare MEMENTO and EAS on a set of 128 unseen instances (of size 500) |
| Hardware Specification | Yes | We use TPU v3-8 for our experiments. We thank Google's TPU Research Cloud (TRC) for supporting our research with Cloud TPUs. |
| Software Dependencies | No | We implement our method and experiments in JAX (Bradbury et al., 2018). The two problems are also JAX implementations from Jumanji (Bonnet et al., 2023). CMA-ES implementation to mix MEMENTO and COMPASS is taken from the research package QDax (Chalumeau et al., 2023a). Neural networks, optimizers, and many utilities are implemented using the DeepMind JAX ecosystem (Babuschkin et al., 2020). |
| Experiment Setup | Yes | All hyperparameters can be found in Appendix C. We report all the hyper-parameters used during train and inference time. For our method MEMENTO, there is no training hyper-parameters to report for instance sizes 125, 150, and 200 as the model used was trained on instances of size 100. The hyper-parameters used for MEMENTO are reported in Table 10. |
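
The Dataset Splits row describes instances whose node positions are sampled uniformly within the unit square, at size 100 for the training distribution and sizes 125, 150, and 200 for generalization. The sketch below illustrates that distribution only; it is not the authors' data pipeline, which relies on benchmark sets from the cited literature and a JAX implementation. The helper names `sample_instances` and `tour_length`, the seeds, and the tiny instance counts are assumptions made for illustration.

```python
import math
import random


# Hypothetical helpers for illustration; not taken from the MEMENTO repository.
def sample_instances(num_instances, num_nodes, seed=0):
    """Sample node coordinates uniformly in the unit square [0, 1)^2."""
    rng = random.Random(seed)
    return [
        [(rng.random(), rng.random()) for _ in range(num_nodes)]
        for _ in range(num_instances)
    ]


def tour_length(nodes, tour):
    """Length of the closed tour that visits `nodes` in the order given by `tour`."""
    return sum(
        math.dist(nodes[tour[i]], nodes[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


# Sizes quoted in the table: 100 (training distribution) and 125, 150, 200
# (generalization). Only 4 instances per size are drawn here (an assumption to
# keep the example fast); the quoted sets hold 10,000 and 1,000 instances.
splits = {
    size: sample_instances(num_instances=4, num_nodes=size, seed=size)
    for size in (100, 125, 150, 200)
}

# Cost of the trivial tour 0, 1, ..., n-1 on the first size-100 instance.
first = splits[100][0]
print(tour_length(first, list(range(len(first)))))
```

Fixing the seed per size makes each split reproducible across runs, which is the property the report's Dataset Splits variable is concerned with.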