Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances

Authors: Zhang-Hua Fu, Kai-Bin Qiu, Hongyuan Zha (pp. 7474-7482)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results based on a large number of instances (with up to 10,000 vertices) show that, this new approach clearly outperforms the existing machine learning based TSP algorithms, and significantly improves the generalization ability of the trained model.
Researcher Affiliation | Academia | 1 Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China; 2 Institute of Robotics and Intelligent Manufacturing, The Chinese University of Hong Kong, Shenzhen, China; 3 School of Data Science, The Chinese University of Hong Kong, Shenzhen, China
Pseudocode | No | The paper describes procedures and includes diagrams (like Figure 3, "Procedure of the Monte Carlo tree search"), but no structured pseudocode or algorithm blocks are explicitly presented.
Open Source Code | Yes | Publicly available at https://github.com/Spider-scnu/TSP.
Open Datasets | Yes | We use two data sets: (1) Set 1 [footnote 4], which is divided into three subsets, each containing 10,000 automatically generated 2D Euclidean TSP instances, respectively with n = 20, 50, 100. This data set is widely used by the existing learning based algorithms. (2) Set 2: following the same rules, we newly generate 400 larger instances, i.e., 128 instances respectively with n = 200, 500, 1000, and 16 instances with n = 10000. Footnote 4: Downloaded from https://drive.google.com/file/d/1-5WS5e7CKsJ9uY9uVXIyxgbcZZNYBrp/view.
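The instance-generation rule quoted above (uniform random 2D Euclidean TSP instances of size n) can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' released generator: the function name and the uniform-unit-square assumption follow the common convention for these benchmark sets.

```python
import numpy as np

def generate_tsp_instances(num_instances, n, seed=0):
    """Illustrative sketch: draw `num_instances` 2D Euclidean TSP
    instances, each with `n` vertices sampled uniformly at random
    from the unit square (the standard convention for these sets)."""
    rng = np.random.default_rng(seed)
    return rng.random((num_instances, n, 2))

# e.g. one subset matching Set 1: 10,000 instances with n = 20
instances = generate_tsp_instances(10_000, 20)
print(instances.shape)  # (10000, 20, 2)
```

Set 2 would be produced the same way, e.g. `generate_tsp_instances(128, 1000)` for the n = 1000 subset.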
Dataset Splits | No | The paper mentions a "train set" and "test instances" but does not specify a separate "validation" set or explicit percentages/counts for data splits.
Hardware Specification | Yes | To ensure fair comparisons, all the learning based baselines as well as our new algorithm are uniformly executed on one GTX 1080 Ti GPU (to fully utilize the computing resources, as many instances as possible are executed in parallel). For the three non-learning algorithms, their source codes currently do not support running on GPU, thus we re-run them on one Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz (with 8 cores), and list the results just for indicative purposes.
Software Dependencies | No | The paper mentions implementation languages ("Python", "C++") but does not specify versions for these languages or any specific software libraries, frameworks, or solvers.
Experiment Setup | Yes | As described before, our method relies on six hyper parameters (m, ω, α, β, H and T). For parameter m which controls the size of the pre-trained model, we set m = 20 for the small instances of data set 1, and set m = 50 for the large instances of data set 2. For the following four parameters, we uniformly choose ω = 5, α = 1, β = 10, H = 10n as the default settings. Finally, for parameter T which controls the termination time, we respectively set T = 10n and T = 40n milliseconds for each instance of data set 1 and data set 2, to ensure that our algorithm elapses no more time than the best (in terms of solution quality) learning algorithm proposed in each reference paper.
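The quoted hyper-parameter settings scale with the instance size n, so they are easy to mis-transcribe when reproducing the setup. The helper below encodes exactly the defaults stated in the paper; the function name and dictionary layout are my own, not taken from the released code.

```python
def default_hyperparameters(n, large_instance=False):
    """Default hyper-parameter settings quoted from the paper.

    n              -- number of vertices in the TSP instance
    large_instance -- False for data set 1 (n <= 100),
                      True for data set 2 (n >= 200)

    m is the pre-trained model size, H the MCTS-related budget,
    and T the termination time in milliseconds.
    """
    return {
        "m": 50 if large_instance else 20,   # m = 20 (set 1), m = 50 (set 2)
        "omega": 5,                          # uniform default
        "alpha": 1,                          # uniform default
        "beta": 10,                          # uniform default
        "H": 10 * n,                         # H = 10n
        "T_ms": (40 if large_instance else 10) * n,  # T = 10n or 40n ms
    }

# e.g. an n = 10000 instance from data set 2 gets T = 40n = 400,000 ms
print(default_hyperparameters(10000, large_instance=True)["T_ms"])  # 400000
```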