reproducibilityindex.ai

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble

Authors: Gaon An, Seungyong Moon, Jang-Hyun Kim, Hyun Oh Song

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our proposed method on D4RL benchmarks [9] and verify that the proposed method outperforms the previous state-of-the-art by a large margin on various types of environments and datasets.
Researcher Affiliation	Collaboration	Seoul National University, Neural Processing Research Center, Deep Metrics
Pseudocode	Yes	Algorithm 1 Ensemble-Diversified Actor Critic (EDAC)
Open Source Code	Yes	The code is available online at https://github.com/snu-mllab/EDAC.
Open Datasets	Yes	We evaluate our proposed methods against the previous offline RL algorithms on the standard D4RL benchmark [9].
Dataset Splits	No	The paper uses standard D4RL benchmarks but does not explicitly provide the specific train/validation/test dataset splits (e.g., percentages or sample counts) used for their experiments.
Hardware Specification	Yes	We run our experiments on a single machine with one RTX 3090 GPU
Software Dependencies	No	The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries used in the implementation.
Experiment Setup	Yes	Appendix B Implementation Details: We use Adam optimizer with learning rates 3e-4 and 1e-4 for Q-functions and actors respectively. The batch size is 256. The networks for both Q-functions and actor are MLPs with two hidden layers of size 256 and ReLU activation.
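
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which makes it easier to check a reimplementation against the reported values. The following is a minimal sketch using only the Python standard library, not the authors' code: the names `EDACSetup`, `mlp_layer_shapes`, and `mlp_param_count` are hypothetical, and the observation and action dimensions (17 and 6) are illustrative placeholders that the row does not state. It also assumes a Q-function that takes the concatenated state and action and outputs a single scalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EDACSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    optimizer: str = "adam"
    q_lr: float = 3e-4              # learning rate for the Q-functions
    actor_lr: float = 1e-4          # learning rate for the actor
    batch_size: int = 256
    hidden_sizes: tuple = (256, 256)  # two hidden layers of size 256
    activation: str = "relu"


def mlp_layer_shapes(in_dim, out_dim, hidden_sizes):
    """Return the (fan_in, fan_out) pair of every linear layer in the MLP."""
    dims = [in_dim, *hidden_sizes, out_dim]
    return list(zip(dims[:-1], dims[1:]))


def mlp_param_count(in_dim, out_dim, hidden_sizes):
    """Count weights plus biases over all linear layers."""
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in mlp_layer_shapes(in_dim, out_dim, hidden_sizes))


setup = EDACSetup()

# Assumption: placeholder dimensions for illustration only.
obs_dim, act_dim = 17, 6

# Assumption: each Q-function maps (state, action) to one scalar value.
q_shapes = mlp_layer_shapes(obs_dim + act_dim, 1, setup.hidden_sizes)
q_params = mlp_param_count(obs_dim + act_dim, 1, setup.hidden_sizes)

print(q_shapes)
print(q_params)
```

The actor is left out of the count because its output size depends on the policy parameterization, which the row does not specify. The ensemble size is also not given in this row, so the sketch describes a single Q-network only.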