Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Connection between One-Step RL and Critic Regularization in Reinforcement Learning
Authors: Benjamin Eysenbach, Matthieu Geist, Sergey Levine, Ruslan Salakhutdinov
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | While our theoretical results require assumptions (e.g., deterministic dynamics), our experiments nevertheless show that our analysis makes accurate, testable predictions about practical offline RL methods (CQL and one-step RL) with commonly-used hyperparameters. |
| Researcher Affiliation | Collaboration | 1Google Research 2Carnegie Mellon University 3UC Berkeley. |
| Pseudocode | No | The paper describes algorithms and updates but does not provide structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code for the tabular experiments is available online. Code: https://github.com/ben-eysenbach/ac-connection |
| Open Datasets | Yes | we will repeat our experiments on four datasets from the D4RL benchmark (Fu et al., 2020). |
| Dataset Splits | No | The paper mentions using datasets for experiments but does not explicitly detail training, validation, and test splits (e.g., percentages or sample counts for each split). |
| Hardware Specification | No | The paper describes experimental setups and environments (e.g., gridworld, D4RL benchmark) but does not specify any hardware details such as GPU/CPU models, memory, or specific computing platforms used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "implementation of one-step RL (reverse KL) and CQL provided by Hoffman et al. (2020)", which refers to another paper's implementation. However, it does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | We use γ = 0.95 and train for 20k full-batch updates, using a learning rate of 1e-2. The Q table is randomly initialized using a standard normal distribution. |
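The reported setup (γ = 0.95, 20k full-batch updates, learning rate 1e-2, standard-normal Q-table initialization) can be sketched as a tabular fitted Q-iteration loop. This is a minimal illustrative sketch only: the gridworld size, the randomly generated deterministic dynamics, and the plain Bellman-backup update rule are assumptions, not the authors' actual code (which is available at the repository linked above).

```python
import numpy as np

# Hyperparameters as reported in the paper; everything else below is assumed.
gamma, lr, n_updates = 0.95, 1e-2, 20_000
n_states, n_actions = 16, 4  # hypothetical gridworld size

rng = np.random.default_rng(0)
# Hypothetical deterministic MDP: next-state table and reward table.
P = rng.integers(n_states, size=(n_states, n_actions))  # s' = P[s, a]
R = rng.standard_normal((n_states, n_actions))

# Q table randomly initialized from a standard normal, as reported.
Q = rng.standard_normal((n_states, n_actions))
for _ in range(n_updates):
    target = R + gamma * Q[P].max(axis=-1)  # full-batch Bellman targets
    Q += lr * (target - Q)                  # damped update with lr = 1e-2

print(Q.shape)  # (16, 4)
```

Because the Bellman operator is a γ-contraction, this damped iteration drives the Bellman residual toward zero over the 20k full-batch updates.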