Off-Policy Deep Reinforcement Learning without Exploration

Authors: Scott Fujimoto, David Meger, Doina Precup

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We present the first continuous control deep reinforcement learning algorithm which can learn effectively from arbitrary, fixed batch data, and empirically demonstrate the quality of its behavior in several tasks."
Researcher Affiliation | Academia | "1Department of Computer Science, McGill University, Montreal, Canada 2Mila Québec AI Institute."
Pseudocode | Yes | "Algorithm 1 BCQ"
Open Source Code | Yes | "To ensure reproducibility, we provide precise experimental and implementation details, and our code is made available (https://github.com/sfujim/BCQ)."
Open Datasets | Yes | "Our practical experiments examine three different batch settings in OpenAI Gym's Hopper-v1 environment (Todorov et al., 2012; Brockman et al., 2016)"
Dataset Splits | No | The paper describes data collection and usage in different batch settings but does not explicitly provide train/validation/test dataset splits or cross-validation details.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using OpenAI Gym and MuJoCo environments but does not provide specific version numbers for these or any other ancillary software dependencies.
Experiment Setup | Yes | "Exact implementation and experimental details are provided in the Supplementary Material."
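The table above notes that the paper includes pseudocode (Algorithm 1, BCQ). The core idea of batch-constrained Q-learning is to restrict the greedy argmax over Q-values to candidate actions that a generative model, trained on the fixed batch, considers likely. The toy sketch below illustrates only that selection rule; the candidate sampler and critic here are hypothetical stand-ins (in BCQ proper they are a learned VAE plus perturbation network and a learned Q-network), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_batch_like_actions(state, n=10):
    # Stand-in for BCQ's generative model: in the real algorithm a VAE
    # trained on the batch proposes actions similar to batch behavior.
    return rng.normal(loc=0.0, scale=0.1, size=(n, 1))

def q_value(state, action):
    # Toy critic in place of a learned Q-network: prefers actions near 0.05.
    return -float((action[0] - 0.05) ** 2)

def bcq_select_action(state, n_candidates=10):
    """BCQ-style action selection: take the argmax of Q only over
    candidate actions the generative model deems plausible under the
    batch, rather than over the full action space."""
    candidates = sample_batch_like_actions(state, n_candidates)
    scores = [q_value(state, a) for a in candidates]
    return candidates[int(np.argmax(scores))]

action = bcq_select_action(state=np.zeros(3))
print(action.shape)  # (1,)
```

Constraining the argmax this way is what lets the method learn from arbitrary fixed data without exploration: it avoids querying Q-values for out-of-distribution actions, where the critic's estimates are unreliable.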