Fast Efficient Hyperparameter Tuning for Policy Gradient Methods

Authors: Supratik Paul, Vitaly Kurin, Shimon Whiteson

NeurIPS 2019 | Conference PDF | Archive PDF

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experimental results across multiple domains and algorithms show that using HOOF to learn these hyperparameter schedules leads to faster learning with improved performance. We evaluate HOOF across a range of simulated continuous control tasks using the MuJoCo OpenAI Gym environments (Brockman et al., 2016). We repeat all experiments across 10 random starts. In all figures solid lines represent the median, and shaded regions the quartiles. Similarly all results in tables represent the median. (An illustrative aggregation snippet appears below the table.)
Researcher Affiliation | Academia | Supratik Paul, Vitaly Kurin, Shimon Whiteson, Department of Computer Science, University of Oxford, {supratik.paul,vitaly.kurin,shimon.whiteson}@cs.ox.ac.uk
Pseudocode | Yes | Algorithm 1: HOOF. (A paraphrased sketch appears below the table.)
Open Source Code | Yes | Details about all hyperparameters can be found in the appendices, and code is available at https://github.com/supratikp/HOOF.
Open Datasets | Yes | To experimentally validate HOOF, we apply it to four simulated continuous control tasks from MuJoCo OpenAI Gym (Brockman et al., 2016): HalfCheetah, Hopper, Ant, and Walker. (An environment-loading sketch appears below the table.)
Dataset Splits | No | The paper refers to 'training run' and 'samples' but does not provide specific numerical train/validation/test dataset splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper states: 'The experiments were made possible by a generous equipment grant from NVIDIA.' While this indicates the hardware vendor, it does not specify a GPU model, CPU type, or any other detailed hardware specification.
Software Dependencies | No | The paper mentions software components such as 'A2C' and 'OpenAI Baselines', and the optimizers 'RMSProp', 'ADAM', and 'SGD', but it does not provide specific version numbers for any of these software dependencies.
Experiment Setup | Yes | The paper states, 'Details about all hyperparameters can be found in the appendices,' which indicates that specific hyperparameter values (e.g., learning rate α, GAE parameters γ and λ, KL constraint ϵ) are provided in the appendices. It also mentions specific settings such as ϵ = 0.03 for HOOF with A2C and discusses tuning hyperparameters such as α0 and β for meta-gradients.
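
The Research Type row above describes the aggregation protocol: 10 runs per experiment, with solid lines showing the median and shaded regions showing the quartiles. A minimal NumPy sketch of that computation follows; the array shape and placeholder data are illustrative and not drawn from the paper's results.

```python
import numpy as np

# Hypothetical learning curves: one row per random start (10 runs),
# one column per evaluation point. Placeholder data, not paper results.
rng = np.random.default_rng(0)
curves = rng.normal(size=(10, 200)).cumsum(axis=1)

# Median (solid line) and lower/upper quartiles (shaded region),
# computed across runs independently at every evaluation point.
q25, median, q75 = np.percentile(curves, [25, 50, 75], axis=0)
```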
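The Pseudocode row points to Algorithm 1 (HOOF), which is not reproduced in this extract. As a rough paraphrase: at every iteration HOOF collects one batch of trajectories, samples several candidate hyperparameter settings, computes the policy update each candidate would produce, scores each candidate policy with weighted importance sampling on the same batch, and greedily keeps the best. The sketch below follows that description; the candidate count and all helper callables are hypothetical placeholders, not the authors' released implementation.

```python
# Minimal sketch of one HOOF iteration, paraphrased from the paper's
# Algorithm 1. The callables passed in are hypothetical placeholders that
# stand in for the underlying RL machinery.
def hoof_iteration(policy, collect_trajectories, sample_hyperparams,
                   policy_update, wis_estimate, n_candidates=20):
    # One batch of trajectories gathered with the current policy.
    trajectories = collect_trajectories(policy)

    best_value, best_candidate = float("-inf"), None
    for _ in range(n_candidates):
        # Randomly sample a candidate hyperparameter setting
        # (e.g. learning rate, GAE gamma and lambda).
        hyperparams = sample_hyperparams()

        # Candidate policy produced by applying the update with these
        # hyperparameters to the same batch.
        candidate = policy_update(policy, trajectories, hyperparams)

        # Off-policy estimate of the candidate's performance via weighted
        # importance sampling over the already-collected trajectories.
        # (For A2C the paper additionally rejects candidates whose KL
        # divergence from the current policy exceeds a threshold eps.)
        value = wis_estimate(candidate, policy, trajectories)
        if value > best_value:
            best_value, best_candidate = value, candidate

    # The greedily chosen candidate becomes the next policy; repeating this
    # every iteration yields the learned hyperparameter schedule.
    return best_candidate
```

Because every candidate is evaluated on the batch that was already collected, the hyperparameter search adds no extra environment samples, which is the source of the sample efficiency claimed in the paper.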
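The Open Datasets row lists four MuJoCo Gym benchmarks. They can be instantiated with the standard Gym API as shown below; the `-v2` version suffixes and the `Walker2d` environment ID are assumptions based on the Gym releases current around 2019, since this extract does not state the exact environment versions used.

```python
import gym

# The four tasks named in the paper; version suffixes are an assumption.
env_ids = ["HalfCheetah-v2", "Hopper-v2", "Ant-v2", "Walker2d-v2"]

for env_id in env_ids:
    env = gym.make(env_id)
    obs = env.reset()
    # One random-action step, just to confirm the environment loads.
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()
```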