Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

PARCO: Parallel AutoRegressive Models for Multi-Agent Combinatorial Optimization

Authors: Federico Berto, Chuanbo Hua, Laurin Luttmann, Jiwoo Son, Junyoung Park, Kyuree Ahn, Changhyun Kwon, Lin Xie, Jinkyoo Park

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate PARCO in multi-agent vehicle routing and scheduling problems, where our approach outperforms state-of-the-art learning methods, demonstrating strong generalization ability and remarkable computational efficiency.
Researcher Affiliation Collaboration Federico Berto (1, 2, 3), Chuanbo Hua (1, 2), Laurin Luttmann (4), Jiwoo Son (2), Junyoung Park (1), Kyuree Ahn (2), Changhyun Kwon (1, 2), Lin Xie (5), Jinkyoo Park (1, 2). Affiliations: (1) KAIST, (2) Omelet, (3) Radical Numerics, (4) Leuphana University, (5) Brandenburg University of Technology. AI4CO
Pseudocode Yes Algorithm 1: Priority-based Conflict Handler
    Require: Actions $a \in \mathbb{N}^M$, Priorities $p \in \mathbb{R}^M$, Fallback actions $r \in \mathbb{R}^M$
    Ensure: Resolved actions $a \in \mathbb{N}^M$
    1: $\sigma \leftarrow \mathrm{argsort}(p, \text{descending} = \text{True})$ // Sort indices based on priorities in descending order
    2: $\hat{a} \leftarrow a[\sigma]$ // Reorder actions according to priority
    3: $C \leftarrow 0^M$ // Initialize conflict mask
    4: for $i = 2$ to $M$ do // Check for conflicts in reordered actions
    5:     if $\hat{a}_i \in \{\hat{a}_1, \dots, \hat{a}_{i-1}\}$ then
    6:         $C_i \leftarrow 1$ // $C_i = 1$ indicates a conflict for index $i$
    7:     end if
    8: end for
    9: $\hat{a} \leftarrow (1 - C) \odot \hat{a} + C \odot r$ // Resolve conflicts by assigning fallback actions
    10: $a \leftarrow \hat{a}[\sigma^{-1}]$ // Reorder resolved actions back to original order
Open Source Code Yes We make our source code publicly available to foster future research: https://github.com/ai4co/parco.
Open Datasets Yes Testing is performed on the 1280 instances per (N, M) test setting from Liu et al. [63]. We follow the instance generation scheme outlined in Kwon et al. [50]. In Table 3: 2D-Ptr [63] Code/Dataset Available on GitHub and MatNet [50] Code/Dataset MIT License.
Dataset Splits No Train data generation: Neural baselines were trained with the specific number of nodes N and number of agents M they were tested on. In PARCO, we select a varying size and number of customer training schemes: at each training step, we sample $N \sim \mathcal{U}(60, 100)$ and $m \sim \mathcal{U}(3, 7)$. ... Testing is performed on the 1280 instances per (N, M) test setting from Liu et al. [63].
Hardware Specification Yes We experiment on a workstation equipped with 2 INTEL(R) XEON(R) GOLD 6338 CPUs and 8 NVIDIA RTX 4090 graphic cards with 24 GB of VRAM each.
Software Dependencies Yes We used Python 3.12, PyTorch 2.5 [78] coupled with PyTorch Lightning [17] with most code based on the RL4CO library [5]. The operating system is Ubuntu 24.04 LTS.
Experiment Setup Yes We train PARCO with RL via SymNCO [43] with K = 10 symmetric augmentations as shared REINFORCE baseline for 100 epochs using the Adam optimizer [46] with a total batch size of 512 (using 4 GPUs in Distributed Data Parallel configuration) and an initial learning rate of $10^{-4}$ with a step decay factor of 0.1 after the 80th and 95th epochs. For each epoch, we sample $4 \times 10^5$ randomly generated data. Training takes around 15 hours in our configuration.
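
The priority-based conflict handler quoted in the Pseudocode row can be illustrated with a small plain-Python sketch. This is not the authors' implementation (the released code is built on PyTorch and RL4CO); the function name `resolve_conflicts` is hypothetical, and the sketch assumes the fallback actions are given per agent, so that an agent losing a conflict receives its own fallback. It fuses the sort, mask, and reorder-back steps into one loop over agents in priority order.

```python
def resolve_conflicts(actions, priorities, fallback):
    """Sketch of priority-based conflict handling for M agents.

    actions[i]    : node index chosen by agent i
    priorities[i] : higher value wins when several agents choose the same node
    fallback[i]   : action agent i takes if it loses (assumption: indexed per agent)
    """
    m = len(actions)
    # Agent indices in descending priority; sorted() is stable, so ties keep agent order.
    order = sorted(range(m), key=lambda i: priorities[i], reverse=True)

    resolved = list(actions)
    seen = set()  # actions already chosen by higher-priority agents
    for agent in order:
        if actions[agent] in seen:
            # Conflict: this lower-priority agent takes its fallback action.
            resolved[agent] = fallback[agent]
        else:
            seen.add(actions[agent])
    # Results were written at the original agent indices, so no inverse permutation is needed.
    return resolved


# Agents 0 and 2 both choose node 5; agent 2 has the higher priority and keeps it.
print(resolve_conflicts([5, 3, 5, 7], [0.2, 0.9, 0.8, 0.1], [0, 0, 0, 0]))  # [0, 3, 5, 7]
```

Writing results back at each agent's original index replaces the final inverse-permutation step of the listing; a batched tensor version would more likely follow the listing's sort, mask, and unsort structure.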
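
The per-step size sampling described in the Dataset Splits row can be sketched as follows. The helper name `sample_training_size` is hypothetical, and the sketch assumes the uniform distributions are over integers with both endpoints included, which the quoted text does not state.

```python
import random


def sample_training_size(rng, n_range=(60, 100), m_range=(3, 7)):
    """Draw one (number of nodes N, number of agents m) pair for a training step.

    Assumption: discrete uniform sampling with inclusive endpoints.
    """
    n = rng.randint(*n_range)  # randint includes both endpoints
    m = rng.randint(*m_range)
    return n, m


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
sizes = [sample_training_size(rng) for _ in range(5)]
print(sizes)
```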
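
The learning-rate schedule in the Experiment Setup row (step decay by 0.1 after the 80th and 95th epochs) can be written out as a small function. This is a sketch only: the function name `learning_rate` is hypothetical, epochs are assumed to be counted from 1, and the even split of the total batch across GPUs is an assumption about the Distributed Data Parallel setup rather than something the quoted text states.

```python
def learning_rate(epoch, base_lr=1e-4, milestones=(80, 95), gamma=0.1):
    """Learning rate used during a given epoch (assumed 1-indexed).

    The rate is multiplied by `gamma` once for every milestone epoch
    that has already finished.
    """
    decays = sum(1 for milestone in milestones if epoch > milestone)
    return base_lr * gamma ** decays


# Assumed even split of the total batch of 512 over 4 GPUs.
per_gpu_batch = 512 // 4

for epoch in (1, 80, 81, 95, 96, 100):
    print(epoch, learning_rate(epoch))
print("per-GPU batch size:", per_gpu_batch)
```

In PyTorch, the same shape of schedule is usually obtained with `torch.optim.lr_scheduler.MultiStepLR`, although the milestone indexing there depends on when the scheduler is stepped.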