Attention, Learn to Solve Routing Problems!
Authors: Wouter Kool, Herke van Hoof, Max Welling
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We significantly improve over recent learned heuristics for the Travelling Salesman Problem (TSP), getting close to optimal results for problems up to 100 nodes. With the same hyperparameters, we learn strong heuristics for two variants of the Vehicle Routing Problem (VRP), the Orienteering Problem (OP) and (a stochastic variant of) the Prize Collecting TSP (PCTSP), outperforming a wide range of baselines and getting results close to highly optimized and specialized algorithms. |
| Researcher Affiliation | Collaboration | Wouter Kool, University of Amsterdam & ORTEC, w.w.m.kool@uva.nl; Herke van Hoof, University of Amsterdam, h.c.vanhoof@uva.nl; Max Welling, University of Amsterdam & CIFAR, m.welling@uva.nl |
| Pseudocode | Yes | Algorithm 1 REINFORCE with Rollout Baseline: 1: Input: number of epochs E, steps per epoch T, batch size B, significance α 2: Init θ, θ^BL ← θ 3: for epoch = 1, ..., E do 4: for step = 1, ..., T do 5: s_i ← RandomInstance() ∀i ∈ {1, ..., B} 6: π_i ← SampleRollout(s_i, p_θ) ∀i ∈ {1, ..., B} 7: π_i^BL ← GreedyRollout(s_i, p_θ^BL) ∀i ∈ {1, ..., B} 8: ∇L ← Σ_{i=1}^{B} (L(π_i) − L(π_i^BL)) ∇_θ log p_θ(π_i) 9: θ ← Adam(θ, ∇L) 10: end for 11: if OneSidedPairedTTest(p_θ, p_θ^BL) < α then 12: θ^BL ← θ 13: end if 14: end for (see the training-loop sketch below the table) |
| Open Source Code | Yes | Our code in PyTorch (Paszke et al., 2017) is publicly available: https://github.com/wouterkool/attention-learn-to-route |
| Open Datasets | No | For all TSP instances, the n node locations are sampled uniformly at random in the unit square. This distribution is chosen to be neither easy nor artificially hard and to be able to compare to other learned heuristics. |
| Dataset Splits | Yes | We use a validation set of size 10000 with greedy decoding, and compare to using an exponential (β = 0.8) and a critic (see Appendix B.1) baseline. |
| Hardware Specification | Yes | For TSP, an epoch takes 5:30 minutes for n = 20, 16:20 for n = 50 (single GPU 1080Ti) and 27:30 for n = 100 (on 2 × 1080Tis). ... (1080Ti) or 32 instances in parallel on a 32 virtual CPU system (2 × Xeon E5-2630). |
| Software Dependencies | No | Our code in PyTorch (Paszke et al., 2017) is publicly available. ... For the C++ Iterated Local Search (ILS) algorithm, we perform 1 run as this takes already 2 minutes per instance (single thread) on average. For the Python ILS algorithm we perform 10 runs as this algorithm is fast. |
| Experiment Setup | Yes | Hyperparameters: We initialize parameters Uniform(−1/√d, 1/√d), with d the input dimension. Every epoch we process 2500 batches of 512 instances (except for VRP with n = 100, where we use 2500 × 256 for memory constraints). ... We train for 100 epochs using training data generated on the fly. ... We use N = 3 layers in the encoder... We use a constant learning rate η = 10⁻⁴. Training with a higher learning rate η = 10⁻³ is possible and speeds up initial learning, but requires decay (0.96 per epoch) to converge and may be a bit more unstable. ... With the rollout baseline, we use an exponential baseline (β = 0.8) during the first epoch... (see the setup sketch below the table) |
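To make the Algorithm 1 excerpt in the Pseudocode row concrete, here is a minimal PyTorch sketch of one training epoch of REINFORCE with a greedy rollout baseline. The `model(x, decode_type=...)` interface, assumed to return per-instance tour costs and log-likelihoods, is a hypothetical stand-in rather than the authors' actual API; only the training logic follows the pseudocode.

```python
import copy

import torch
from scipy import stats


def train_epoch(model, baseline_model, optimizer,
                steps=2500, batch_size=512, n_nodes=20,
                alpha=0.05, device="cpu"):
    """One epoch of REINFORCE with a greedy rollout baseline (cf. Algorithm 1).

    `model(x, decode_type=...)` is a hypothetical interface assumed to return
    (tour_cost, log_likelihood) per instance; it is not the authors' exact API.
    """
    model.train()
    for _ in range(steps):
        # Lines 5-7: random TSP instances (node locations uniform in the unit
        # square), a sampled rollout of p_theta, and a greedy rollout of the
        # frozen baseline policy p_theta^BL.
        x = torch.rand(batch_size, n_nodes, 2, device=device)
        cost, log_p = model(x, decode_type="sampling")
        with torch.no_grad():
            bl_cost, _ = baseline_model(x, decode_type="greedy")

        # Lines 8-9: REINFORCE gradient with the rollout baseline, then Adam.
        loss = ((cost - bl_cost).detach() * log_p).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Lines 11-13: replace the baseline policy when the current policy is
    # significantly better on held-out instances (one-sided paired t-test).
    eval_x = torch.rand(10000, n_nodes, 2, device=device)
    with torch.no_grad():
        cur_cost, _ = model(eval_x, decode_type="greedy")
        bl_cost, _ = baseline_model(eval_x, decode_type="greedy")
    _, p_two_sided = stats.ttest_rel(cur_cost.cpu().numpy(), bl_cost.cpu().numpy())
    if cur_cost.mean() < bl_cost.mean() and p_two_sided / 2 < alpha:
        baseline_model.load_state_dict(copy.deepcopy(model.state_dict()))
```

Using the cost of a greedy rollout of a frozen copy of the policy as baseline is the paper's central algorithmic idea: it keeps the advantage estimate low-variance without training a separate critic network.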
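The Experiment Setup row lists initialization and optimizer settings; the snippet below shows one way they could translate into PyTorch. The small `Sequential` network is only a placeholder for the attention encoder-decoder, and bias initialization is omitted for brevity; only the numeric values (Uniform(−1/√d, 1/√d) init, learning rate 10⁻⁴, or 10⁻³ with 0.96 decay per epoch) come from the quoted text.

```python
import math

import torch
from torch import nn

# Placeholder network; the paper uses an attention encoder (N = 3 layers) plus a decoder.
model = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, 128))

# Initialize weight matrices Uniform(-1/sqrt(d), 1/sqrt(d)), with d the input
# (last) dimension; bias terms are left at their defaults in this sketch.
for name, param in model.named_parameters():
    if param.dim() > 1:
        bound = 1.0 / math.sqrt(param.size(-1))
        nn.init.uniform_(param, -bound, bound)

# Constant learning rate 1e-4, as used for the reported results ...
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# ... or 1e-3 with 0.96 decay per epoch, which speeds up initial learning:
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda epoch: 0.96 ** epoch)
```

With the decayed variant, `scheduler.step()` would be called once per epoch in a training loop such as the one sketched above.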