INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
Authors: Han Fang, Zhihao Song, Paul Weng, Yutong Ban
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that the proposed INViT achieves a dominant generalization performance on both TSP and CVRP problems with various distributions and different problem scales. Code is available at https://github.com/Kasumigaoka-Utaha/INViT. 5. Experimental Results. To validate the generalizability of the proposed INViT, we use a series of datasets across various scales and distributions. |
| Researcher Affiliation | Academia | ¹Joint Institute of Michigan, Shanghai Jiao Tong University, Shanghai, China; ²Duke Kunshan University, Jiangsu, China. Correspondence to: Paul Weng <paul.weng@dukekunshan.edu.cn>, Yutong Ban <yban@sjtu.edu.cn>. |
| Pseudocode | No | The paper describes the model architecture and algorithm in text and through diagrams (Figure 3), but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Code is available at https://github.com/Kasumigaoka-Utaha/INViT. |
| Open Datasets | Yes | MSVDRP Dataset. We have produced a dataset called the Multi-Scale Various-Distribution Routing Problem (MSVDRP) dataset... Public Datasets. Furthermore, we also use public datasets: TSPLIB and CVRPLIB to validate the performance. These instances have diverse problem scales and adhere to real-world distributions. For TSP, we include all symmetric instances in TSPLIB95 (Reinelt, 1991)... For CVRP, we include all instances in CVRPLIB Set-X by Uchoa et al. (2017)... |
| Dataset Splits | No | The paper uses the term 'validate' for overall performance evaluation but does not specify distinct training, validation, and test splits with percentages, counts, or a cross-validation strategy for model tuning or early stopping during training. It primarily describes training on specific datasets and then evaluating on others for generalization. |
| Hardware Specification | Yes | All the experiments are performed on the same machine, equipped with a single Intel Core i7-12700 CPU and a single RTX 4090 GPU. |
| Software Dependencies | Yes | For all evaluated methods, we keep the PyTorch version 1.12 on Python 3.9. |
| Experiment Setup | Yes | For the proposed INViT, the initial learning rate is set to 10^-4, with a weight decay of 0.01. The model is trained for 1.5 × 10^5 steps, with a batch size of 128... For INViT, each single-view encoder contains 2 attention layers and the decoder contains 3 attention layers. The number of dimensions for features is 128 and the number of dimensions for feed-forward layers is 512. The number of heads for each multi-head attention layer is 8. The default augmentation size is 8 and the default batch size is 64. For the whole training procedure, we train 500 epochs with 300 steps for each epoch. The default learning rate is 10^-4. |
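The Experiment Setup row pins down most of the architecture and optimizer hyperparameters. Below is a minimal PyTorch sketch of that configuration. The module names are illustrative placeholders, the choice of AdamW is an assumption (the excerpt reports the learning rate and weight decay but does not name the optimizer), and the generic attention stacks merely stand in for INViT's nested single-view encoders and decoder; this is not the authors' implementation.

```python
# Minimal sketch of the training configuration quoted above.
# Assumptions: AdamW optimizer; generic TransformerEncoder stacks as
# stand-ins for INViT's single-view encoders and decoder.
import torch
import torch.nn as nn

EMBED_DIM = 128   # feature dimension (quoted)
FF_DIM = 512      # feed-forward dimension (quoted)
N_HEADS = 8       # heads per multi-head attention layer (quoted)

def make_attention_stack(n_layers: int) -> nn.TransformerEncoder:
    # Generic multi-head attention stack; the paper uses 2 layers per
    # single-view encoder and 3 layers in the decoder.
    layer = nn.TransformerEncoderLayer(
        d_model=EMBED_DIM, nhead=N_HEADS,
        dim_feedforward=FF_DIM, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=n_layers)

encoder = make_attention_stack(2)  # one of the nested single-view encoders
decoder = make_attention_stack(3)  # decoder with 3 attention layers

params = list(encoder.parameters()) + list(decoder.parameters())
# lr = 1e-4 with weight decay 0.01, as reported; AdamW is an assumption.
optimizer = torch.optim.AdamW(params, lr=1e-4, weight_decay=0.01)

# Training budget from the quoted setup: 500 epochs x 300 steps = 1.5e5 steps.
N_EPOCHS, STEPS_PER_EPOCH = 500, 300
BATCH_SIZE = 128  # the excerpt also mentions a default batch size of 64
```

Note that the two training budgets quoted (1.5 × 10^5 steps vs. 500 epochs × 300 steps per epoch) are consistent with each other, while the two batch sizes (128 vs. the default 64) appear to come from different parts of the paper.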