reproducibilityindex.ai

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Authors: James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter Wurman, Peter Stone

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We benchmark SAC-D, SAC-D-CAGrad and SAC-D-Naive against SAC on a selection of continuous-action Gym [7] environments. For each environment, we exposed existing additive reward components without altering the behavior of the environments or their composite rewards.
Researcher Affiliation	Collaboration	James MacGlashan (james.macglashan@sony.com), Evan Archer (evan.archer@sony.com), Alisa Devlic (alisa.devlic@sony.com), Takuma Seno (takuma.seno@sony.com), Craig Sherstan (craig.sherstan@sony.com), Peter R. Wurman (peter.wurman@sony.com), Peter Stone (pstone@cs.utexas.edu) ... Affiliations: Sony AI; The University of Texas at Austin. Footnote: equal contribution.
Pseudocode	Yes	Algorithm 1: SAC-D and SAC-D-CAGrad Update
Open Source Code	No	At present, we are unable to release our source code or data.
Open Datasets	Yes	We benchmark SAC-D, SAC-D-CAGrad and SAC-D-Naive against SAC on a selection of continuous-action Gym [7] environments.
Dataset Splits	No	The paper states: 'As outlined in App. C, we used hyperparameters previously published for use with SAC [14] for all experiments.' While it refers to hyperparameters and training, it does not explicitly provide specific train/validation/test dataset splits with percentages or sample counts in the main text.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. While the ethics checklist indicates that resources used are mentioned, these details are not found within the main paper content or appendices.
Software Dependencies	No	The paper mentions 'Gym [7]' and 'the rliable framework [2]', but it does not list specific version numbers for key software dependencies such as Python, PyTorch/TensorFlow, or other libraries used for implementation.
Experiment Setup	No	The paper states, 'As outlined in App. C, we used hyperparameters previously published for use with SAC [14] for all experiments.' This indicates that specific experimental setup details, such as hyperparameter values, are deferred to an appendix rather than being presented explicitly in the main text.
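
Illustrative note on the Research Type row: the paper says it "exposed existing additive reward components without altering the behavior of the environments or their composite rewards." Since the authors state that their source code is not released, the following is only a minimal sketch of what that could look like, not their implementation. The class ToyDecomposedEnv, the component names "progress" and "control_cost", and the "reward_components" key are all hypothetical and chosen for illustration; no real Gym API is used.

```python
from typing import Dict, Tuple


class ToyDecomposedEnv:
    """Hypothetical 1-D environment whose scalar reward is a sum of named components."""

    def __init__(self) -> None:
        self.position = 0.0

    def step(self, action: float) -> Tuple[float, float, Dict[str, Dict[str, float]]]:
        # Dynamics are untouched by the decomposition: the state update
        # depends only on the action, exactly as it would without it.
        self.position += action

        # Assumed reward terms (illustrative only).
        components = {
            "progress": action,                   # reward for moving forward
            "control_cost": -0.1 * action ** 2,   # penalty on large actions
        }

        # The composite reward is the plain sum of the components, so an
        # agent that ignores the breakdown sees the same scalar as before.
        composite = sum(components.values())
        return self.position, composite, {"reward_components": components}


env = ToyDecomposedEnv()
observation, reward, info = env.step(0.5)
```

In this sketch a decomposed learner could read info["reward_components"] to train one value estimate per component, while a standard learner would keep using the unchanged scalar reward.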