reproducibilityindex.ai

Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks

Authors: Pei Xu, Junge Zhang, Qiyue Yin, Chao Yu, Yaodong Yang, Kaiqi Huang

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Under the sparse-reward setting, we show that the proposed algorithm significantly outperforms the state-of-the-art algorithms in the multiple-particle environment, the Google Research Football and StarCraft II micromanagement tasks.
Researcher Affiliation	Academia	1 School of Artificial Intelligence, University of Chinese Academy of Sciences; 2 CRISE, Institute of Automation, Chinese Academy of Sciences; 3 CAS, Center for Excellence in Brain Science and Intelligence Technology; 4 School of Computer Science and Engineering, Sun Yat-sen University; 5 Beijing Institute for General AI; 6 Institute for AI, Peking University. Emails: xupei2018@ia.ac.cn, {jgzhang,qyyin,kqhuang}@nlpr.ia.ac.cn, yuchao3@mail.sysu.edu.cn, yaodong.yang@pku.edu.cn
Pseudocode	No	The paper describes algorithms and formulations but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code	No	The paper does not provide a link to open-source code for the methodology or explicitly state that the code is publicly available.
Open Datasets	Yes	We evaluate SAME on three challenging environments: a discrete version of the multiple-particle environment (MPE) (Wang et al. 2019), the Google Research Football (GRF) (Kurach et al. 2020) and StarCraft II micromanagement (SMAC) (Samvelyan et al. 2019).
Dataset Splits	No	The paper does not explicitly provide training/test/validation dataset splits. It mentions using specific environments for evaluation in a reinforcement learning context but no data splits in the traditional supervised learning sense.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments. It mentions using 'continuous state space' for SMAC and GRF, but no hardware specifics.
Software Dependencies	No	The paper mentions using RND (Burda et al. 2019b) for calculating `bfull` and states that experiments follow training settings of CDS (Chenghao et al. 2021) and use TD(λ). However, it does not provide specific version numbers for these or other software libraries/dependencies.
Experiment Setup	Yes	To calculate ˆbwt sub in the continuous state space (such as SMAC and GRF), we discretize each dimension of the state space into B equally spaced atomic states. ... All experiments run with five random seeds. Details for environments and training are given in Appendix.
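
The discretization step quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the function names (`discretize`, `count_subspace_visits`), the per-dimension bounds `low`/`high`, and the use of visit counts over a chosen subset of dimensions are all assumptions made for illustration. The row itself only states that each dimension is split into B equally spaced atomic states.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def discretize(
    state: Sequence[float],
    low: Sequence[float],
    high: Sequence[float],
    num_bins: int,
) -> Tuple[int, ...]:
    """Map each dimension of a continuous state to one of num_bins equal-width bins.

    `low` and `high` are assumed per-dimension bounds (hypothetical inputs);
    values outside the bounds are clamped to the first or last bin.
    """
    cell = []
    for x, lo, hi in zip(state, low, high):
        if hi <= lo:
            raise ValueError("each dimension needs high > low")
        frac = (x - lo) / (hi - lo)          # position within [0, 1]
        idx = int(frac * num_bins)           # equal-width bin index
        cell.append(min(max(idx, 0), num_bins - 1))
    return tuple(cell)


def count_subspace_visits(
    states: Iterable[Sequence[float]],
    dims: Sequence[int],
    low: Sequence[float],
    high: Sequence[float],
    num_bins: int,
) -> Counter:
    """Count visits to discretized cells, restricted to the dimensions in `dims`.

    Counting visits in a subspace is an illustrative assumption here,
    not a description of how the paper computes its bonus.
    """
    counts: Counter = Counter()
    for s in states:
        cell = discretize(s, low, high, num_bins)
        counts[tuple(cell[d] for d in dims)] += 1
    return counts


if __name__ == "__main__":
    low, high = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
    trajectory = [[0.0, 0.5, 1.0], [0.1, 0.9, 0.2], [0.2, 0.1, 0.7]]
    print(discretize(trajectory[0], low, high, num_bins=4))
    print(count_subspace_visits(trajectory, dims=[0], low=low, high=high, num_bins=4))
```

A larger B gives finer cells but spreads visits over more of them, so the choice of B trades resolution against how quickly counts accumulate.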