MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer
Authors: Jeewon Jeon, Woojun Kim, Whiyoung Jung, Youngchul Sung
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results show that MASER significantly outperforms other state-of-the-art MARL algorithms on the StarCraft II micromanagement benchmark. |
| Researcher Affiliation | Academia | 1School of Electrical Engineering, KAIST, Daejeon, South Korea. |
| Pseudocode | No | The paper describes its algorithm using prose and mathematical equations but does not include a formal pseudocode block or an 'Algorithm' section. |
| Open Source Code | Yes | The source code of the proposed algorithm is available at https://github.com/Jiwonjeon9603/MASER. |
| Open Datasets | Yes | To evaluate MASER, we considered the widely-used StarCraft II micromanagement benchmark (SMAC) environment. |
| Dataset Splits | No | The paper mentions using different random seeds for experiments and provides hyperparameters but does not explicitly describe data splits for training, validation, and testing (e.g., 80/10/10 split or specific counts). |
| Hardware Specification | Yes | Our code is based on PyTorch and we used an NVIDIA TITAN Xp GPU. |
| Software Dependencies | No | The paper mentions software components such as PyTorch, DRQN, GRU, and RMSProp, but it does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | The values of hyper-parameters are shown in Table 2. In MASER, the most recent 5000 episodes are stored in the replay buffer, and the mini-batch size is 32. MASER updates the target network every 200 episodes. We used RMSProp as the optimizer with a learning rate of 0.0005. The discount factor for the expected return is 0.99, and the epsilon for epsilon-greedy Q-learning starts at 1.0 and is annealed to 0.05 over 50000 steps. (See the configuration sketch below the table.) |
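
For readers wanting to reproduce the reported setup, the following is a minimal sketch of the stated hyper-parameters collected into a Python configuration dictionary. The key names and the `anneal_epsilon` helper are illustrative assumptions, not identifiers from the MASER repository; only the numeric values come from the paper, and the linear annealing schedule is assumed since the paper states only the start, end, and anneal time.

```python
# Hyper-parameters reported for MASER (values from the paper; key names are assumptions).
MASER_CONFIG = {
    "replay_buffer_episodes": 5000,   # most recent episodes kept in the replay buffer
    "batch_size": 32,                 # mini-batch size
    "target_update_interval": 200,    # target network update period (episodes)
    "optimizer": "RMSProp",
    "learning_rate": 0.0005,
    "gamma": 0.99,                    # discount factor for the expected return
    "epsilon_start": 1.0,             # epsilon-greedy exploration schedule
    "epsilon_finish": 0.05,
    "epsilon_anneal_steps": 50000,
}


def anneal_epsilon(step: int, cfg: dict = MASER_CONFIG) -> float:
    """Anneal epsilon from epsilon_start to epsilon_finish.

    A linear schedule is assumed here; the paper only reports the start
    value, end value, and anneal time.
    """
    frac = min(step / cfg["epsilon_anneal_steps"], 1.0)
    return cfg["epsilon_start"] + frac * (cfg["epsilon_finish"] - cfg["epsilon_start"])


if __name__ == "__main__":
    # Example: epsilon at a few points along the assumed schedule.
    for step in (0, 25000, 50000, 100000):
        print(step, round(anneal_epsilon(step), 3))
```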