Toward Efficient Gradient-Based Value Estimation

Authors: Arsalan Sharifnassab, Richard S. Sutton

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our empirical results on a few classic control environments with neural network function approximation show significant improvement over RG, achieving performance competitive with TD." |
| Researcher Affiliation | Academia | "Authors are with the Department of Computing Science, University of Alberta, Canada." |
| Pseudocode | Yes | "Algorithm 1 RAN" |
| Open Source Code | No | The paper does not provide any statement or link indicating the release of open-source code for the methodology. |
| Open Datasets | Yes | "We ran an experiment on classic control tasks Acrobot and Cartpole to test the performance of the RANS algorithm. ... In another experiment, we evaluated the performance of RANS on simple MuJoCo environments Hopper and HalfCheetah." |
| Dataset Splits | No | The paper describes an online learning setting in which samples are fed directly to the training algorithms, so no predefined train/validation splits are used. |
| Hardware Specification | No | The paper does not describe the specific hardware (e.g., CPU or GPU models, memory) used to run its experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and neural networks, but does not specify any software libraries or dependencies with version numbers (e.g., TensorFlow or PyTorch versions). |
| Experiment Setup | Yes | "The parameters used in the experiments are as follows. For RAN, we set α = 0.025, β = 0.4, and λ = 0.9998. For RG and TD(0) we used α = 0.5. ... For TD(0), we used softmax coefficient 1 and Adam optimizer with step-size 0.005. ... For RANS, we set α = 0.001 and all other parameters were set to their default values described in Algorithm 4." |
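
For quick reference, the hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration. The sketch below is a plain Python mapping of those reported values; the key names and grouping are our own, and the RANS parameters deferred to the paper's Algorithm 4 defaults are deliberately left out.

```python
# Hyperparameters quoted in the Experiment Setup row above, gathered into one
# place. Key names and grouping are illustrative assumptions, not the paper's.
EXPERIMENT_CONFIG = {
    "RAN": {"alpha": 0.025, "beta": 0.4, "lambda": 0.9998},
    "RG": {"alpha": 0.5},
    "TD0": {
        "alpha": 0.5,
        "softmax_coefficient": 1,
        "optimizer": "Adam",
        "adam_step_size": 0.005,
    },
    # RANS: alpha as reported; the remaining parameters follow the defaults of
    # the paper's Algorithm 4 and are not reproduced here.
    "RANS": {"alpha": 0.001},
}

if __name__ == "__main__":
    for algo, params in EXPERIMENT_CONFIG.items():
        print(algo, params)
```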
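For context on the RG-versus-TD comparison referenced in the table, here is a minimal sketch of the two classical baselines the paper measures against: semi-gradient TD(0) and the residual-gradient (RG) method. It uses a toy tabular chain with one-hot features rather than the paper's neural networks, and it does not implement RAN or RANS; the chain environment and the reuse of the reported α = 0.5 here are illustrative assumptions.

```python
# Minimal sketch of the two baseline updates named in the report: semi-gradient
# TD(0) and residual gradient (RG). Toy chain MDP with one-hot (tabular)
# features; this is NOT the paper's neural-network setup and does not implement
# RAN/RANS. alpha = 0.5 matches the value quoted for RG and TD(0) above.
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.5
features = np.eye(n_states)  # one-hot features make this the tabular case

def value(w, s):
    return features[s] @ w

def td0_update(w, s, r, s_next):
    # Semi-gradient TD(0): the bootstrap target r + gamma * V(s') is treated
    # as a constant, so only the gradient of V(s) appears in the update.
    delta = r + gamma * value(w, s_next) - value(w, s)
    return w + alpha * delta * features[s]

def rg_update(w, s, r, s_next):
    # Residual gradient: exact gradient of 0.5 * delta^2, so the gradient of
    # V(s') also enters, scaled by -gamma.
    delta = r + gamma * value(w, s_next) - value(w, s)
    return w + alpha * delta * (features[s] - gamma * features[s_next])

rng = np.random.default_rng(0)
w_td, w_rg = np.zeros(n_states), np.zeros(n_states)
for _ in range(2000):
    s = int(rng.integers(n_states - 1))  # sample a non-terminal state
    s_next = s + 1                       # deterministic right-moving chain
    r = 1.0 if s_next == n_states - 1 else 0.0
    w_td = td0_update(w_td, s, r, s_next)
    w_rg = rg_update(w_rg, s, r, s_next)

print("TD(0) values:", np.round(w_td, 3))
print("RG values:   ", np.round(w_rg, 3))
```

The contrast this sketch makes concrete is the one underlying the paper's comparison: TD(0) ignores the gradient through the bootstrap target, while RG follows the true gradient of the squared TD error and therefore also adjusts the value of the successor state.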