Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space

Authors: Zhou Fan, Rui Su, Weinan Zhang, Yong Yu

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments test H-PPO on a collection of tasks with parameterized action space, where H-PPO demonstrates superior performance over previous methods of parameterized action reinforcement learning. (An illustrative action-space sketch follows this table.)
Researcher Affiliation | Academia | Zhou Fan, Rui Su, Weinan Zhang and Yong Yu, Shanghai Jiao Tong University. zhou.fan@sjtu.edu.cn, {surui, wnzhang, yyu}@apex.sjtu.edu.cn
Pseudocode | No | The paper describes the proposed algorithms but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code (no repository link, explicit code-release statement, or code in supplementary materials) for the methodology described.
Open Datasets | No | The paper states 'We create a collection of tasks with parameterized action space for the experiments' and points to supplemental material for environment descriptions, but it does not cite or link to any publicly available datasets.
Dataset Splits | No | The paper does not provide dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not report the hardware used for its experiments (no GPU/CPU models, processor speeds, or memory amounts).
Software Dependencies | No | The paper does not list ancillary software with version numbers (e.g., Python 3.8, CPLEX 12.4) needed to replicate the experiments.
Experiment Setup | Yes | The networks in the four algorithms are of the same size, and the hidden layer sizes for each network are (256, 256, 128, 64). The replay buffer size for DDPG and DQN is 10000, and the batch size for sampling is 32. (A configuration sketch follows this table.)
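The Research Type row quotes the paper's evaluation on tasks with a parameterized action space, i.e., a discrete action choice paired with continuous parameters. Since the environments themselves are not released, the following is only a minimal sketch, assuming a gym-style encoding with made-up action counts and parameter dimensions, of what such a hybrid action space can look like; none of these names or numbers come from the paper.

```python
# Illustrative sketch only: a parameterized (hybrid discrete-continuous) action
# space. The action counts and parameter dimensions below are assumptions for
# illustration, not values taken from the paper or its environments.
import numpy as np
from gym import spaces  # gymnasium.spaces works the same way

NUM_DISCRETE_ACTIONS = 3   # hypothetical number of discrete actions
PARAM_DIMS = [1, 2, 1]     # hypothetical continuous-parameter dimension per action

# One common encoding: a discrete action k plus one parameter vector per action.
action_space = spaces.Tuple((
    spaces.Discrete(NUM_DISCRETE_ACTIONS),
    spaces.Tuple(tuple(
        spaces.Box(low=-1.0, high=1.0, shape=(d,), dtype=np.float32)
        for d in PARAM_DIMS
    )),
))

sample = action_space.sample()  # e.g. (k, (x_0, x_1, x_2))
print(sample)
```

In this encoding the environment interprets only the parameter vector that belongs to the chosen discrete action; the remaining vectors are ignored.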
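The Experiment Setup row reports the only concrete hyperparameters extracted: hidden layer sizes of (256, 256, 128, 64), a replay buffer of 10000 for DDPG and DQN, and a batch size of 32. Below is a minimal PyTorch sketch of a network with those hidden sizes; the input and output dimensions, activation function, and head layout are assumptions for illustration, since the quoted passage does not specify them.

```python
# Minimal sketch of a fully connected network using the hidden-layer sizes
# reported in the paper: (256, 256, 128, 64). Everything else here (input and
# output dimensions, ReLU activations, the two example heads) is assumed.
import torch.nn as nn

def mlp(in_dim: int, out_dim: int, hidden=(256, 256, 128, 64)) -> nn.Sequential:
    """Build an MLP with the reported hidden sizes."""
    layers, prev = [], in_dim
    for h in hidden:
        layers += [nn.Linear(prev, h), nn.ReLU()]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

STATE_DIM, NUM_DISCRETE_ACTIONS = 10, 3          # hypothetical dimensions
discrete_actor = mlp(STATE_DIM, NUM_DISCRETE_ACTIONS)  # logits over discrete actions
critic = mlp(STATE_DIM, 1)                              # state-value estimate

# Baseline hyperparameters reported for DDPG and DQN in the comparison:
REPLAY_BUFFER_SIZE = 10_000
BATCH_SIZE = 32
```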