reproducibilityindex.ai

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

Authors: Chenlu Ye, Rui Yang, Quanquan Gu, Tong Zhang

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Motivated by our theoretical findings, we present a practical offline RL algorithm with uncertainty weighting and demonstrate its efficacy under diverse data corruption scenarios. Our practical implementation achieves a 104% improvement over the previous state-of-the-art uncertainty-based offline RL algorithm under data corruption, demonstrating its potential for effective deployment in real-world applications... Section 5 (Experiments): Based on our theoretical results, we propose a practical implementation for CR-PEVI and verify its effectiveness on simulation tasks with corrupted offline data.
Researcher Affiliation	Academia	Chenlu Ye, The Hong Kong University of Science and Technology (cyeab@connect.ust.hk); Rui Yang, The Hong Kong University of Science and Technology (ryangam@connect.ust.hk); Quanquan Gu, University of California, Los Angeles (qgu@cs.ucla.edu); Tong Zhang, The Hong Kong University of Science and Technology (tongzhang@ust.hk)
Pseudocode	Yes	Algorithm 1 Uncertainty Weight Iteration... Algorithm 2 CR-PEVI
Open Source Code	No	The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets	Yes	We assess the performance of our approach using continuous control tasks from [15]... [15] Fu, J., Kumar, A., Nachum, O., Tucker, G., and Levine, S. (2020). D4RL: Datasets for deep data-driven reinforcement learning. arXiv preprint arXiv:2004.07219.
Dataset Splits	No	No explicit details on train/validation/test dataset splits (e.g., percentages, sample counts) or the use of cross-validation are provided in the paper.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, cloud instance types) used to run the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies, libraries, or solvers with version numbers.
Experiment Setup	Yes	The ensemble size K is set to 10 for all experiments. For evaluation, we report average returns with standard deviations over 10 random seeds. More implementation details are also provided in Appendix D.
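
The reporting protocol quoted in the Experiment Setup row (average returns with standard deviations over 10 random seeds) can be illustrated with a minimal sketch. The function name `summarize_returns`, the use of the sample (n-1) standard deviation, and the input values are assumptions made for illustration; the paper does not state which estimator it uses, and the numbers below are placeholders, not results from the paper.

```python
import statistics


def summarize_returns(returns_per_seed):
    """Return (mean, sample standard deviation) of per-seed average returns.

    Hypothetical helper: the choice of the sample (n-1) estimator is an
    assumption, not something the paper specifies.
    """
    if len(returns_per_seed) < 2:
        raise ValueError("need at least two seeds to compute a standard deviation")
    mean = statistics.fmean(returns_per_seed)
    std = statistics.stdev(returns_per_seed)
    return mean, std


# Placeholder per-seed returns for 10 seeds (NOT results from the paper).
example_returns = [float(10 * seed) for seed in range(10)]
mean, std = summarize_returns(example_returns)
print(f"{mean:.1f} ± {std:.1f}")
```

If the population standard deviation is wanted instead, `statistics.pstdev` could be substituted for `statistics.stdev`.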