Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning

Authors: Alexey Skrynnik, Anton Andreychuk, Maria Nesterova, Konstantin Yakovlev, Aleksandr Panov

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on a wide range of setups comparing it to the state-of-the-art solvers. The results show that our method consistently outperforms the learnable competitors, showing higher throughput and better ability to generalize to the maps that were unseen at the training stage.
Researcher Affiliation | Academia | Alexey Skrynnik (1,2), Anton Andreychuk (1), Maria Nesterova (2,3), Konstantin Yakovlev (2,1), Aleksandr Panov (1,3); (1) AIRI, Moscow, Russia; (2) Federal Research Center for Computer Science and Control of Russian Academy of Sciences, Moscow, Russia; (3) MIPT, Dolgoprudny, Russia
Pseudocode | No | The paper describes algorithms and their components but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/AIRI-Institute/learn-to-follow.
Open Datasets | Yes | We use the readily available weights for PRIMAL2 and SCRIMP neural networks from the authors' repositories. PICO was trained by us using the open-source code of its authors. Our solvers, FOLLOWER and FOLLOWERLITE, were trained on the maze-like maps only. ...Now we run an additional evaluation where we compare FOLLOWER, FOLLOWERLITE, PRIMAL2 and SCRIMP on two (unseen during learning) maps from the Moving AI benchmark (Stern et al. 2019), well known in the MAPF community: den520d and Paris 1.
Dataset Splits | No | The paper describes training and evaluation on different sets of maps, but does not provide specific percentages or counts for training/validation/test splits within a single dataset, nor does it specify cross-validation settings.
Hardware Specification | Yes | Upon fixing the parameters, the final policy of FOLLOWER is trained for 1 billion steps using a single NVIDIA A100 in approximately 18 hours. FOLLOWERLITE is trained for 20 million steps with a single NVIDIA TITAN RTX GPU in approximately 30 minutes.
Software Dependencies | No | The paper mentions software components like 'POGEMA environment', 'ResNet', 'GRU', and 'A*', but it does not specify any version numbers for these or other software dependencies.
Experiment Setup | Yes | For training, the episode length was set to 512. The agent's field-of-view was 11×11, and the number of agents varied in the range: 128, 256. The reward r was a small positive number, i.e. r = 0.01. More details about tuning the hyperparameters are reported in the arXiv version of the paper.
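For readers reconstructing the experiment setup, the hyperparameters quoted in the last row can be gathered into a single configuration object. The sketch below is purely illustrative: the class and field names (TrainingConfig, obs_radius, num_agents_range, reward) are assumptions rather than the authors' actual configuration schema; only the numeric values come from the paper.

```python
# Minimal illustrative sketch of the reported training hyperparameters.
# Names are hypothetical; values are taken from the "Experiment Setup" row above.
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrainingConfig:
    episode_length: int = 512                        # training episode length
    obs_radius: int = 5                              # 11x11 field of view (2 * 5 + 1 = 11)
    num_agents_range: Tuple[int, int] = (128, 256)   # number of agents varied in this range
    reward: float = 0.01                             # small positive reward r = 0.01


if __name__ == "__main__":
    print(TrainingConfig())
```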