CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning
Authors: Sheng Yue, Guanbo Wang, Wei Shao, Zhaofeng Zhang, Sen Lin, Ju Ren, Junshan Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments corroborate the significant performance gains of CLARE over existing state-of-the-art algorithms on MuJoCo continuous control tasks (especially with a small offline dataset), and the learned reward is highly instructive for further learning (source code). |
| Researcher Affiliation | Academia | Sheng Yue (Tsinghua University), Guanbo Wang (Tongji University), Wei Shao (University of California, Davis), Zhaofeng Zhang (Arizona State University), Sen Lin (Ohio State University), Ju Ren (Tsinghua University; Zhongguancun Laboratory), Junshan Zhang (University of California, Davis) |
| Pseudocode | Yes | Algorithm 1: Conservative model-based reward learning (CLARE) |
| Open Source Code | Yes | "Extensive experiments corroborate the significant performance gains of CLARE over existing state-of-the-art algorithms on MuJoCo continuous control tasks (especially with a small offline dataset), and the learned reward is highly instructive for further learning (source code)." and "Our implementation is built upon the open source framework of offline RL algorithms, provided at: https://github.com/polixir/OfflineRL" |
| Open Datasets | Yes | "we compare CLARE with the following existing offline IRL methods on the D4RL benchmark (Fu et al., 2020)" and "the D4RL dataset provided at: https://github.com/rail-berkeley/d4rl (under the Apache License 2.0)" (a minimal loading sketch is given below the table). |
| Dataset Splits | No | The paper mentions selecting dynamics models by "validation prediction error on a held-out set" but does not explicitly provide the train/validation/test splits (percentages, counts, or standard splits) for the main D4RL datasets used in its experiments (a hypothetical split sketch is given below the table). |
| Hardware Specification | Yes | We implement the code in PyTorch 1.11.0 on a server with a 32-core AMD Ryzen Threadripper PRO 3975WX and an NVIDIA GeForce RTX 3090 Ti. |
| Software Dependencies | Yes | We implement the code in PyTorch 1.11.0 |
| Experiment Setup | Yes | "Appendix A.2 HYPERPARAMETERS" and "Table 2: Hyperparameters for CLARE," which lists specific values for learning rates, batch size, horizon, regularization weight, discount factor, and number of steps/epochs. |
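
A minimal sketch (not taken from the paper) of how the D4RL data cited in the "Open Datasets" row can be loaded, assuming the `gym` and `d4rl` packages from https://github.com/rail-berkeley/d4rl are installed; the task name `hopper-medium-v2` is an illustrative choice rather than a confirmed evaluation task from the paper.

```python
import gym
import d4rl  # importing d4rl registers the offline benchmark environments with gym

# Illustrative task name; the paper's exact D4RL task set is not restated here.
env = gym.make('hopper-medium-v2')

# qlearning_dataset returns a dict of aligned numpy arrays of transitions.
dataset = d4rl.qlearning_dataset(env)
observations = dataset['observations']            # states s
actions = dataset['actions']                      # actions a
rewards = dataset['rewards']                      # rewards r (not needed for reward learning itself)
next_observations = dataset['next_observations']  # successor states s'
terminals = dataset['terminals']                  # episode-termination flags

print(observations.shape, actions.shape)
```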
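The paper reports choosing dynamics models by validation prediction error on a held-out set but does not state the split, so the following sketch shows one plausible way to carve such a split from the transitions loaded above; the 90/10 ratio and the `train_valid_split` helper are assumptions, not the authors' protocol.

```python
import numpy as np

def train_valid_split(dataset, valid_frac=0.1, seed=0):
    """Randomly hold out a fraction of transitions for dynamics-model validation.

    `dataset` is a dict of equally long numpy arrays, e.g. the dict returned
    by d4rl.qlearning_dataset(env) in the previous sketch.
    """
    n = len(dataset['observations'])
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_valid = int(n * valid_frac)
    valid_idx, train_idx = idx[:n_valid], idx[n_valid:]
    take = lambda ids: {k: v[ids] for k, v in dataset.items()}
    return take(train_idx), take(valid_idx)

# Hypothetical 90/10 split; the paper does not specify its actual ratio.
train_set, valid_set = train_valid_split(dataset, valid_frac=0.1)
```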