Leveraging Partial Symmetry for Multi-Agent Reinforcement Learning
Authors: Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Simin Li, Shuhao Liao, Wenjun Wu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to demonstrate the superior performance of the proposed framework over baselines. Finally, we implement the proposed framework in real-world multi-robot testbed to show its superiority. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Beihang University, Beijing, China; (2) Institute of Artificial Intelligence, Beihang University, Beijing, China |
| Pseudocode | No | The paper describes its framework and references existing algorithms, but it does not provide any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Video demonstrations and Supplementary materials are available at the project website https://xinyu-site.github.io/PSE/. |
| Open Datasets | No | The paper refers to tasks like Predator-Prey and Cooperative Navigation as 'classic scenarios implemented in multi-agent particle environment (Mordatch and Abbeel 2017)' and Wildlife Monitoring as a 'grid-world-based environment (van der Pol et al. 2020)'. It also mentions the 'Webots simulator' for Formation Change. While these are established environments, the paper does not provide concrete access information (e.g., specific links, DOIs, or file repositories) for any *datasets* used for training, beyond citing the papers that describe the environments. |
| Dataset Splits | No | The paper does not explicitly provide details about specific training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions that policies were deployed on 'Epuck' robots and that simulations were run in 'Webots', but it does not specify the hardware used for training the models or running the simulations (e.g., specific CPU or GPU models). |
| Software Dependencies | No | The paper mentions the use of the 'Webots simulator' and various MARL algorithms (MADDPG, QMIX, MAPPO) but does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | No | The paper mentions that 'The performance of each algorithm was evaluated with 10 different random seeds' and discusses general aspects of the experimental setup like noise levels, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |