Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
Authors: Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that CR-OMLE achieves a regret of $\tilde{O}(\sqrt{T} + C)$, where C denotes the cumulative corruption level after T episodes. We also prove a lower bound to show that the additive dependence on C is optimal. To the best of our knowledge, this is the first work on corruption-robust model-based RL algorithms with provable guarantees. |
| Researcher Affiliation | Academia | The Hong Kong University of Science and Technology; University of California, Los Angeles; University of Illinois Urbana-Champaign. |
| Pseudocode | Yes | Algorithm 1 Corruption-Robust Optimistic MLE (CR-OMLE), Algorithm 2 Corruption-Robust Pessimistic MLE (CR-PMLE), Algorithm 3 Uncertainty Weight Iteration |
| Open Source Code | No | The paper does not state that its code is open source and provides no links to code repositories. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets, nor does it provide any concrete access information for any datasets. |
| Dataset Splits | No | The paper is theoretical and involves no experimental validation with dataset splits. Any mention of 'train', 'validation', or 'test' refers to general RL concepts, not to data partitions used in this work. |
| Hardware Specification | No | The paper focuses on theoretical analysis and algorithm design and does not report on experiments that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not report on experiments requiring specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and proofs; it therefore provides no experimental setup details such as hyperparameter values or training configurations. |