Fourier Policy Gradients

Authors: Matthew Fellows, Kamil Ciosek, Shimon Whiteson

ICML 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "While the main contribution of this paper is theoretical, we also provide an empirical evaluation using a periodic critic on a simple turntable problem that demonstrates the practical benefit of using a trigonometric critic. [...] We evaluated a periodic critic of this form on a toy turntable domain, where the goal is to rotate a flat record to a desired position (see Appendix D for details). We compared it to the DPG baseline from OpenAI (Dhariwal et al., 2017), which uses a neural-network critic capable of addressing complex control tasks. As expected, the learning curves in Figure 1 show that using a periodic critic (F-EPG) leads to faster learning, because it encodes more information about the action space than a generic neural network." |
| Researcher Affiliation | Academia | Matthew Fellows, Kamil Ciosek, and Shimon Whiteson, Department of Computer Science, University of Oxford, United Kingdom. Correspondence to: Matthew Fellows <matthew.fellows@cs.ox.ac.uk>. |
| Pseudocode | Yes | The paper provides pseudocode as Algorithm 1, "Expected Policy Gradient". |
| Open Source Code | No | The paper contains no statement or link indicating that source code for the described method is publicly available. |
| Open Datasets | No | The paper describes a "toy turntable domain" (Appendix D) but provides no access information (link, DOI, or formal citation) for this environment or any associated data as a publicly available dataset. |
| Dataset Splits | No | The paper gives no dataset-split details (e.g., train/validation/test percentages or counts) needed for reproducibility. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., GPU/CPU models, memory, or cloud computing resources). |
| Software Dependencies | No | The paper mentions using OpenAI Baselines but provides no version numbers for any software dependencies or libraries. |
| Experiment Setup | No | The paper describes the turntable domain and the comparison with a DPG baseline, but it does not report hyperparameter values (e.g., learning rate, batch size, optimizer settings) or detailed training configurations for the experiments. |
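The "periodic critic" the review quotes is a Q-function that is, by construction, periodic in an angular action, which is what lets it encode more about the action space than a generic network. A minimal sketch of that idea is below: a critic linear in trigonometric (Fourier) features of the action angle, trained by a TD-style squared-error update. The class and function names, feature construction, and learning rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fourier_features(action_angle, num_frequencies=3):
    """Map an angle (radians) to the features [1, cos(k*a)..., sin(k*a)...]."""
    ks = np.arange(1, num_frequencies + 1)
    return np.concatenate(
        ([1.0], np.cos(ks * action_angle), np.sin(ks * action_angle))
    )

class PeriodicCritic:
    """Hypothetical critic, linear in Fourier features of the action angle."""

    def __init__(self, num_frequencies=3, lr=0.1):
        self.num_frequencies = num_frequencies
        self.lr = lr
        self.w = np.zeros(2 * num_frequencies + 1)  # one weight per feature

    def q_value(self, action_angle):
        return float(self.w @ fourier_features(action_angle, self.num_frequencies))

    def update(self, action_angle, target):
        # One gradient step on the squared TD error (target - Q)^2.
        phi = fourier_features(action_angle, self.num_frequencies)
        td_error = target - self.w @ phi
        self.w += self.lr * td_error * phi

critic = PeriodicCritic()
# The critic is 2*pi-periodic in the action by construction:
a = 0.7
assert abs(critic.q_value(a) - critic.q_value(a + 2 * np.pi)) < 1e-9
```

Because every feature is 2π-periodic, the critic cannot assign different values to physically identical record positions, whereas a generic neural-network critic would have to learn that symmetry from data.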