Hybrid Learning for Multi-agent Cooperation with Sub-optimal Demonstrations
Authors: Peixi Peng, Junliang Xing, Lili Cao
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on a real-time strategy combat game. Experimental results show that the approach outperforms many competing demonstration-based methods. |
| Researcher Affiliation | Academia | Peixi Peng¹,², Junliang Xing¹ and Lili Cao¹; ¹Institute of Automation, Chinese Academy of Sciences; ²Peking University |
| Pseudocode | Yes | Algorithm 1: The best response dynamics algorithm. Algorithm 2: The proposed learning algorithm. (A hedged sketch of best-response dynamics follows this table.) |
| Open Source Code | No | The paper provides no links, explicit statements, or supplementary-material references pointing to open-source code for the described methodology. |
| Open Datasets | Yes | Our approach is tested using SparCraft [Churchill et al., 2012], which is a simulator of the StarCraft local combat game and is widely adopted to test AI algorithms [Churchill and Buro, 2013; Lelis, 2017; Moraes and Lelis, 2018]. An additional experiment is conducted on the traffic junction task [Sukhbaatar and Fergus, 2016]. |
| Dataset Splits | No | The paper mentions a 'cross-validation setting' but does not provide specific details on the dataset splits (e.g., percentages or sample counts for training, validation, and test sets). |
| Hardware Specification | Yes | The models are trained on a GeForce GTX 1080 and tested on a PC with one 2.4 GHz CPU and 8 GB RAM. |
| Software Dependencies | No | The paper mentions 'SGD' as an optimizer but does not specify any software libraries, frameworks, or their version numbers used in the implementation. |
| Experiment Setup | Yes | The iterations of Alg. 1, EUpdate of Alg. 2, and λ are set to 7, 500, and 0.995, respectively. The mean win rates and terminal hit-points reward over 100 battles are used as evaluation metrics. All networks are optimized by SGD with learning rate 10⁻³. (A hedged training-setup sketch follows this table.) |
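
The Pseudocode row cites a best-response dynamics routine (Algorithm 1). Below is a minimal Python sketch of generic best-response dynamics, not a reconstruction of the paper's algorithm: the helpers `q_value` and `action_space` are assumptions, and only the iteration count (7) comes from the reported setup.

```python
# Minimal sketch of best-response dynamics for joint action selection.
# ASSUMPTIONS: `q_value(i, action, others)` scores agent i's candidate
# action against the others' fixed actions; `action_space[i]` lists
# agent i's legal actions. Neither name comes from the paper.

def best_response_dynamics(agents, action_space, q_value, iterations=7):
    """Iteratively let each agent switch to its best response while the
    other agents' actions are held fixed (7 iterations per the paper)."""
    # Start from an arbitrary joint action (here: each agent's first action).
    joint_action = {i: action_space[i][0] for i in agents}
    for _ in range(iterations):
        for i in agents:
            others = {j: a for j, a in joint_action.items() if j != i}
            # Agent i's best response given the others' current actions.
            joint_action[i] = max(
                action_space[i],
                key=lambda a: q_value(i, a, others),
            )
    return joint_action

# Toy usage: two agents, two actions, a score that favors "attack".
agents = [0, 1]
action_space = {0: ["attack", "retreat"], 1: ["attack", "retreat"]}
score = lambda i, a, others: 1.0 if a == "attack" else 0.0
print(best_response_dynamics(agents, action_space, score))  # {0: 'attack', 1: 'attack'}
```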
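
The Experiment Setup row gives only optimizer-level hyperparameters. The fragment below restates them as code, assuming PyTorch purely for illustration (per the Software Dependencies row, the paper names no framework); the network shape is a placeholder.

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder network; the paper's actual architectures are not reproduced here.
policy_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))

# "All networks are optimized by SGD with learning rate 10^-3."
optimizer = optim.SGD(policy_net.parameters(), lr=1e-3)

lam = 0.995        # λ from the reported setup; its exact role is defined in the paper
br_iterations = 7  # iterations of Alg. 1 (best-response dynamics)
e_update = 500     # EUpdate period of Alg. 2
```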