MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts

Authors: Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, Xu Chi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants, and showcases decent results on the few-shot setting and real-world benchmark instances. We further conduct extensive studies on the effect of MoE configurations in solving VRPs, and observe the superiority of hierarchical gating when facing out-of-distribution data.
Researcher Affiliation | Collaboration | 1) College of Computing and Data Science, Nanyang Technological University, Singapore; 2) School of Computing and Information Systems, Singapore Management University, Singapore; 3) Department of Information Systems, Eindhoven University of Technology, The Netherlands; 4) Institute of Marine Science and Technology, Shandong University, China; 5) Singapore Institute of Manufacturing Technology (SIMTech), Agency for Science, Technology and Research (A*STAR), Singapore.
Pseudocode | No | The paper describes its methods but does not contain any structured pseudocode or algorithm blocks (e.g., 'Algorithm 1').
Open Source Code | Yes | The source code is available at: https://github.com/RoyalSkye/Routing-MVMoE.
Open Datasets | Yes | We evaluate all neural solvers on the CVRPLIB benchmark dataset, including CVRP and VRPTW instances with various problem sizes and attribute distributions. We mainly consider the classic Set-X (Uchoa et al., 2017) and Set-Solomon (Solomon, 1987). ... We present more details of VRP variants and the associated data generation process in Appendix A.
Dataset Splits | No | The paper mentions training on '100M training instances', evaluating on a 'test dataset that contains 1K instances', and showing 'validation curves' in Fig. 7. However, it does not provide specific sizes or split percentages for a distinct validation set, which limits reproducibility of the data splits.
Hardware Specification | Yes | All experiments are conducted on a machine with NVIDIA Ampere A100-80GB GPU cards and an AMD EPYC 7513 CPU at 2.6GHz.
Software Dependencies | No | The paper mentions using the 'Adam optimizer' and implies a Python-based, PyTorch-style implementation, but it does not specify version numbers for these software components or other libraries. It also mentions 'HGS (Vidal, 2022)', 'LKH3 (Helsgaun, 2017)', and 'OR-Tools (Furnon & Perron, 2023)', but without specific version numbers for these solvers.
Experiment Setup | Yes | Adam optimizer is used with a learning rate of 1e-4, a weight decay of 1e-6, and a batch size of 128. The model is trained for 5000 epochs, each containing 20000 training instances (i.e., 100M training instances in total). The learning rate is decayed by a factor of 10 for the last 10% of training instances. We consider two problem scales n ∈ {50, 100} during training... We employ m = 4 experts with K = β = 2 in each MoE layer, and set the weight α of the auxiliary loss L_b to 0.01.
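
To make the reported setup concrete, below is a minimal PyTorch sketch of such a configuration, assuming a standard top-K gated MoE feed-forward layer with m = 4 experts and K = 2, an Adam optimizer with learning rate 1e-4 and weight decay 1e-6, and a load-balancing auxiliary loss weighted by α = 0.01. All class and variable names (MoEFeedForward, aux_loss, alpha) are illustrative and not taken from the authors' released code, and the gating shown is the plain node-level variant rather than the paper's hierarchical gating.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Generic feed-forward MoE layer with top-K gating and a load-balancing loss (illustrative)."""
    def __init__(self, d_model=128, d_ff=512, num_experts=4, top_k=2):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                    # x: (num_tokens, d_model)
        probs = self.gate(x).softmax(dim=-1)                 # (num_tokens, num_experts)
        topk_p, topk_idx = probs.topk(self.top_k, dim=-1)    # route each token to K experts
        topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)   # renormalize the K gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = (topk_idx == e)                            # (num_tokens, top_k)
            rows = hit.any(dim=-1)
            if rows.any():
                w = (topk_p * hit).sum(dim=-1, keepdim=True)[rows]
                out[rows] = out[rows] + w * expert(x[rows])
        # Load-balancing auxiliary loss (importance x load), encouraging even expert usage.
        importance = probs.mean(dim=0)                                         # (num_experts,)
        load = F.one_hot(topk_idx, self.num_experts).float().mean(dim=(0, 1))  # (num_experts,)
        aux_loss = self.num_experts * (importance * load).sum()
        return out, aux_loss

# Reported hyperparameters: Adam with lr 1e-4, weight decay 1e-6, batch size 128,
# 5000 epochs x 20000 instances, m = 4 experts, K = 2, auxiliary-loss weight alpha = 0.01.
layer = MoEFeedForward(num_experts=4, top_k=2)
optimizer = torch.optim.Adam(layer.parameters(), lr=1e-4, weight_decay=1e-6)
alpha = 0.01

tokens = torch.randn(128, 128)          # dummy batch of token embeddings (batch size 128)
y, aux = layer(tokens)
loss = y.pow(2).mean() + alpha * aux    # stand-in task loss plus weighted auxiliary loss
loss.backward()
optimizer.step()
```

In this sketch the auxiliary loss is added to the task loss with the reported weight of 0.01, which is the usual way such a load-balancing term is combined with the main objective; the paper's actual training loss and hierarchical gating may differ in detail.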