Leveraging Partial Symmetry for Multi-Agent Reinforcement Learning

Authors: Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Simin Li, Shuhao Liao, Wenjun Wu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted to demonstrate the superior performance of the proposed framework over baselines. Finally, we implement the proposed framework in real-world multi-robot testbed to show its superiority.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Beihang University, Beijing, China; (2) Institute of Artificial Intelligence, Beihang University, Beijing, China
Pseudocode | No | The paper describes its framework and references existing algorithms, but it does not provide any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Video demonstrations and Supplementary materials are available at the project website https://xinyu-site.github.io/PSE/.
Open Datasets | No | The paper refers to tasks like Predator-Prey and Cooperative Navigation as 'classic scenarios implemented in multi-agent particle environment (Mordatch and Abbeel 2017)' and Wildlife Monitoring as a 'grid-world-based environment (van der Pol et al. 2020)'. It also mentions the 'Webots simulator' for Formation Change. While these are established environments, the paper does not provide concrete access information (e.g., specific links, DOIs, or file repositories) for any datasets used for training, beyond citing the papers that describe the environments.
Dataset Splits | No | The paper does not explicitly provide details about specific training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper mentions that policies were deployed on 'Epuck' robots and that simulations were run in 'Webots', but it does not specify the hardware used for training the models or running the simulations (e.g., specific CPU or GPU models).
Software Dependencies | No | The paper mentions the use of the 'Webots simulator' and various MARL algorithms (MADDPG, QMIX, MAPPO) but does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9).
Experiment Setup | No | The paper mentions that 'The performance of each algorithm was evaluated with 10 different random seeds' and discusses general aspects of the experimental setup like noise levels, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations.