Policy Gradient Method For Robust Reinforcement Learning

Authors: Yue Wang, Shaofeng Zou

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, we provide simulation results to demonstrate the robustness of our methods. |
| Researcher Affiliation | Academia | Department of Electrical Engineering, University at Buffalo, New York, USA. Correspondence to: Shaofeng Zou <szou3@buffalo.edu>. |
| Pseudocode | Yes | Algorithm 1 Robust Policy Gradient |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the authors' source code for the described methodology is publicly available. |
| Open Datasets | Yes | We test our algorithms on the Garnet problem (Archibald et al., 1995) and the Taxi environment from Open AI (Brockman et al., 2016). |
| Dataset Splits | No | The paper describes testing algorithms on environments (Garnet, Taxi) but does not provide specific dataset split information (e.g., percentages or sample counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running experiments, such as exact GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions general software concepts like 'neural network parameterized policy' but does not provide specific software dependency details with version numbers (e.g., library names with specific versions). |
| Experiment Setup | Yes | In this section, we consider Garnet problem G(30, 20) using neural network parameterized policy, where we use a two-layer neural network with 15 neurons in the hidden layer to parameterize the policy πθ. We then use a two-layer neural network (with 20 neurons in the hidden layer) in the critic. |
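
The Open Datasets row names the two test environments. The sketch below shows one way to instantiate them; it is a minimal sketch, not the authors' code. The Garnet generator (branching factor, reward distribution) and the `Taxi-v3` Gym environment ID are assumptions, since the paper only references Archibald et al. (1995) and Brockman et al. (2016).

```python
# Minimal sketch of the two test environments referenced in the paper.
# The Garnet construction below is a common G(n_states, n_actions) variant;
# the branching factor and reward distribution are assumptions.
import numpy as np
import gym  # OpenAI Gym (Brockman et al., 2016); gymnasium also provides Taxi-v3


def make_garnet(n_states=30, n_actions=20, branching=5, seed=0):
    """Randomly generated MDP ("Garnet") with |S| = n_states, |A| = n_actions."""
    rng = np.random.default_rng(seed)
    P = np.zeros((n_states, n_actions, n_states))  # transition kernel P(s' | s, a)
    R = rng.uniform(size=(n_states, n_actions))    # random rewards r(s, a)
    for s in range(n_states):
        for a in range(n_actions):
            # Each (s, a) pair transitions to a small random set of next states.
            next_states = rng.choice(n_states, size=branching, replace=False)
            P[s, a, next_states] = rng.dirichlet(np.ones(branching))
    return P, R


P, R = make_garnet()              # Garnet problem G(30, 20)
taxi_env = gym.make("Taxi-v3")    # Taxi environment (assumed environment ID)
```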
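
The Experiment Setup row fixes the network sizes (15 hidden units for the policy, 20 for the critic) but not the activation functions, state encoding, or training details. The following PyTorch sketch fills in those gaps with assumed choices and is not the authors' implementation.

```python
# Sketch of the policy/critic parameterization described in the Experiment
# Setup row: two-layer networks with 15 (policy) and 20 (critic) hidden units.
# ReLU activations and one-hot state encoding are assumptions.
import torch
import torch.nn as nn

n_states, n_actions = 30, 20  # Garnet problem G(30, 20)

policy_net = nn.Sequential(   # pi_theta: state features -> action probabilities
    nn.Linear(n_states, 15),
    nn.ReLU(),
    nn.Linear(15, n_actions),
    nn.Softmax(dim=-1),
)

critic_net = nn.Sequential(   # critic: state features -> scalar value estimate
    nn.Linear(n_states, 20),
    nn.ReLU(),
    nn.Linear(20, 1),
)

state = torch.eye(n_states)[0]    # one-hot encoding of state 0 (assumed)
action_probs = policy_net(state)  # pi_theta(. | s)
value = critic_net(state)         # estimated value of s
```

The softmax head makes the policy network output a distribution over the 20 actions directly, which is one common way to realize a "neural network parameterized policy" for a finite action space.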