Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Ensemble Bootstrapping for Q-Learning

Authors: Oren Peer, Chen Tessler, Nadav Merlis, Ron Meir

ICML 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we show that there exist domains where both over- and under-estimation result in sub-optimal performance. Finally, we demonstrate the superior performance of a deep RL variant of EBQL over other deep QL algorithms for a suite of ATARI games. In this section, we present two main experimental results of EBQL compared to QL and DQL in both a tabular setting and on the ATARI ALE (Bellemare et al., 2013) using the deep RL variants." |
| Researcher Affiliation | Academia | "Viterbi Faculty of Electrical Engineering, Technion Institute of Technology, Haifa, Israel." |
| Pseudocode | Yes | "Algorithm 1 Ensemble Bootstrapped Q-Learning (EBQL)" |
| Open Source Code | No | The paper does not provide a direct link to the source code for the methodology described in this paper or explicitly state its availability. |
| Open Datasets | Yes | "Here, we evaluate EBQL in a high dimensional task ATARI ALE (Bellemare et al., 2013)." |
| Dataset Splits | No | The paper mentions training over 50M steps but does not specify dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers needed to replicate the experiments. |
| Experiment Setup | No | "All hyper-parameters are identical to the baselines, as reported in (Mnih et al., 2015), including the use of target networks (see Appendix E for pseudo-codes)." |
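The paper's Algorithm 1 (tabular EBQL) maintains an ensemble of Q-tables and, at each step, updates one member using the average of the *remaining* members as the bootstrap estimator, which is the mechanism that reduces estimation bias relative to single-estimator Q-learning. The sketch below illustrates that update rule in a minimal tabular form; the function name, array layout, and hyper-parameter defaults are illustrative assumptions, not the authors' released code (the report above notes no source code was published).

```python
import numpy as np

def ebql_update(Q, k, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular EBQL-style step (illustrative sketch, not the authors' code).

    Q : ndarray of shape (K, n_states, n_actions) -- ensemble of K Q-tables.
    k : index of the ensemble member being updated this step.
    """
    K = Q.shape[0]
    # Greedy action is selected by the updated member itself.
    a_star = int(np.argmax(Q[k, s_next]))
    # Bootstrap value comes from the *other* K-1 members, averaged.
    others = [j for j in range(K) if j != k]
    bootstrap = Q[others, s_next, a_star].mean()
    target = r + gamma * bootstrap
    # Standard TD update applied only to member k.
    Q[k, s, a] += alpha * (target - Q[k, s, a])
    return Q

# Tiny usage example: 3 members, 2 states, 2 actions, all zero-initialized.
Q = np.zeros((3, 2, 2))
Q = ebql_update(Q, k=0, s=0, a=0, r=1.0, s_next=1)
```

With zero-initialized tables the bootstrap term is zero, so the first update simply moves `Q[0, 0, 0]` a fraction `alpha` of the way toward the reward. Decoupling action selection (member `k`) from value estimation (the other members) is the same bias-reduction idea as Double Q-learning, generalized to an ensemble.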