Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces

Authors: Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan

IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our empirical results on several challenging tasks (simulated Robo Cup Soccer and game Ghost Story) show that both Deep MAPQN and Deep MAHHQN are effective and significantly outperform existing independent deep parameterized Q-learning method. |
| Researcher Affiliation | Collaboration | Haotian Fu¹, Hongyao Tang¹, Jianye Hao¹, Zihan Lei², Yingfeng Chen², Changjie Fan² (¹College of Intelligence and Computing, Tianjin University; ²Fuxi AI Lab in Netease). {haotianfu, bluecontra, jianye.hao}@tju.edu.cn, {leizihan, chenyingfeng1, fanchangjie}@corp.netease.com |
| Pseudocode | No | The paper describes the steps of the algorithms in paragraph form and through equations, but does not include a dedicated pseudocode block or algorithm listing (a hedged sketch of the hybrid action-selection step appears below the table). |
| Open Source Code | No | The paper links to supplementary material on the mixing network structure and experimental settings, and to a video of learned policies, but does not state that the source code for its methodology is open-source or provide a link to it (an illustrative mixing-network sketch appears below the table). |
| Open Datasets | Yes | In this section, we evaluate our algorithms in 1) the standard benchmark game HFO, 2) 3v3 mode in a large-scale online video game Ghost Story. Half Field Offense (HFO) is an abstraction of full Robo Cup 2D game. Previous work [Hausknecht and Stone, 2016; Wang et al., 2018; Wei et al., 2018b] applied RL to the single-agent version of HFO... A full list of state information can be found at the official website https://github.com/mhauskn/HFO/blob/master/doc/manual.pdf. |
| Dataset Splits | No | The paper discusses training and execution phases, but does not provide specific details on training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | The actual training time of Deep MAPQN is about three days, while Deep MAHHQN takes less than one day to train on the same NVidia GeForce GTX 1080Ti GPU. |
| Software Dependencies | No | The paper does not specify any software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers. |
| Experiment Setup | No | The paper describes the reward functions and the training coordination between high-level and low-level networks, but does not provide specific hyperparameters such as learning rate, batch size, or optimizer settings. |
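
Since the paper gives no pseudocode, the following minimal sketch illustrates the hybrid action-selection step in the style of deep parameterized Q-learning (P-DQN), the single-agent approach that Deep MAPQN extends: an actor network proposes continuous parameters for every discrete action, and a Q-network scores each discrete action paired with its parameters. The class names (`ParamActor`, `HybridQNet`), layer sizes, and toy dimensions are illustrative assumptions, not the authors' actual architecture.

```python
# Sketch of P-DQN-style hybrid (discrete + continuous) action selection.
# All sizes and names are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

class ParamActor(nn.Module):
    """Maps a state to one continuous parameter vector per discrete action."""
    def __init__(self, state_dim, n_discrete, param_dim):
        super().__init__()
        self.n_discrete, self.param_dim = n_discrete, param_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_discrete * param_dim), nn.Tanh(),  # params in [-1, 1]
        )

    def forward(self, state):
        return self.net(state).view(-1, self.n_discrete, self.param_dim)

class HybridQNet(nn.Module):
    """Scores Q(s, k, x_k) for every discrete action k given its parameters."""
    def __init__(self, state_dim, n_discrete, param_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_discrete * param_dim, 64), nn.ReLU(),
            nn.Linear(64, n_discrete),
        )

    def forward(self, state, all_params):
        flat = all_params.flatten(start_dim=1)
        return self.net(torch.cat([state, flat], dim=-1))  # (batch, n_discrete)

def select_hybrid_action(state, actor, qnet):
    """Pick argmax_k Q(s, k, x_k(s)) and return the pair (k, x_k)."""
    with torch.no_grad():
        params = actor(state)           # (1, n_discrete, param_dim)
        q_values = qnet(state, params)  # (1, n_discrete)
        k = q_values.argmax(dim=-1).item()
        return k, params[0, k]

# Toy usage with made-up dimensions:
actor = ParamActor(state_dim=8, n_discrete=3, param_dim=2)
qnet = HybridQNet(state_dim=8, n_discrete=3, param_dim=2)
k, x_k = select_hybrid_action(torch.randn(1, 8), actor, qnet)
print(k, x_k)
```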
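
The mixing network itself is only described in the paper's supplementary material, which is not reproduced here. As a stand-in, the sketch below follows the standard QMIX-style monotonic mixing construction that Deep MAPQN builds on: hypernetworks conditioned on the global state generate non-negative weights for combining per-agent Q-values into a joint value. All layer sizes are assumptions, and this should be read as an illustration of the general technique rather than the authors' exact network.

```python
# Illustrative QMIX-style monotonic mixing network (not the paper's exact one).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixingNet(nn.Module):
    """Combines per-agent Q-values into Q_tot, monotonic in each input."""
    def __init__(self, n_agents, state_dim, embed_dim=32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Hypernetworks: the global state generates the mixing weights/biases.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim),
                                      nn.ReLU(), nn.Linear(embed_dim, 1))

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        batch = agent_qs.size(0)
        qs = agent_qs.view(batch, 1, self.n_agents)
        # abs() keeps weights non-negative, so Q_tot is monotone in each agent's Q.
        w1 = self.hyper_w1(state).abs().view(batch, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(batch, 1, self.embed_dim)
        hidden = F.elu(torch.bmm(qs, w1) + b1)               # (batch, 1, embed_dim)
        w2 = self.hyper_w2(state).abs().view(batch, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(batch, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(batch, 1)   # Q_tot: (batch, 1)

# Toy usage (3 agents, illustrative dimensions):
mixer = MixingNet(n_agents=3, state_dim=16)
q_tot = mixer(torch.randn(4, 3), torch.randn(4, 16))
print(q_tot.shape)  # torch.Size([4, 1])
```

The monotonicity constraint (non-negative mixing weights) is what lets each agent greedily maximize its own Q-value while still maximizing the joint value, which is the standard rationale for this family of mixing networks.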