reproducibilityindex.ai

Hybrid RL: Using both offline and online data can make RL efficient

Authors: Yuda Song, Yifei Zhou, Ayush Sekhari, Drew Bagnell, Akshay Krishnamurthy, Wen Sun

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section we discuss empirical results comparing Hy-Q to several representative RL methods on two challenging benchmarks.
Researcher Affiliation	Collaboration	Yuda Song (Carnegie Mellon University); Yifei Zhou (Cornell University); Ayush Sekhari (MIT); J. Andrew Bagnell (Carnegie Mellon University); Akshay Krishnamurthy (Microsoft Research); Wen Sun (Cornell University)
Pseudocode	Yes	Algorithm 1 Hybrid Q-learning using both offline and online data (Hy-Q)
Open Source Code	Yes	We also open source our code at https://github.com/yudasong/HyQ.
Open Datasets	No	Our (offline) dataset can be reproduced with the attached instructions, and our results could be reproduced with the given random seeds.
Dataset Splits	No	The paper does not provide specific percentages or counts for training, validation, or test dataset splits. It discusses training and evaluation but not explicit data partitioning.
Hardware Specification	Yes	We run our experiments on a cluster of computes with Nvidia RTX 3090 GPUs and various CPUs which do not incur any randomness to the results.
Software Dependencies	No	The paper mentions the Adam optimizer and implies the use of PyTorch through its GitHub link, but it does not list specific software dependencies with version numbers.
Experiment Setup	Yes	We provide the hyperparameters of Hy-Q in Table 1. In addition, we provide the hyperparameters we tried for the CQL baseline in Table 2.