Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL

Authors: Peng Cheng, Xianyuan Zhan, Zhihao Wu, Wenjia Zhang, Youfang Lin, Shoucheng Song, Han Wang, Li Jiang

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Based on extensive experiments, we find TSRL achieves great performance on small benchmark datasets with as few as 1% of the original samples, which significantly outperforms the recent offline RL algorithms in terms of data efficiency and generalizability. |
| Researcher Affiliation | Academia | Peng Cheng 1,3; Xianyuan Zhan 2,4; Zhihao Wu 1,3; Wenjia Zhang 2; Shoucheng Song 1,3; Han Wang 1,3; Youfang Lin 1,3; Li Jiang 2. (1) Beijing Jiaotong University, Beijing, China; (2) Tsinghua University, Beijing, China; (3) Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing, China; (4) Shanghai Artificial Intelligence Laboratory, Shanghai, China |
| Pseudocode | Yes | Algorithm 1: T-Symmetry Regularized Offline RL (TSRL) |
| Open Source Code | Yes | Code is available at: https://github.com/pcheng2/TSRL |
| Open Datasets | Yes | We evaluate TSRL on the D4RL MuJoCo-v2 and Adroit-v1 benchmark datasets [5] |
| Dataset Splits | Yes | We compare the performance of TSRL and the baseline methods on both the full D4RL datasets and their reduced-size datasets with only 5k–10k samples, which are constructed by randomly sampling a given fraction of trajectories in the full datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper mentions software components like 'Adam optimizer', 'ReLU activation', 'PyTorch', 'functorch', and 'JAX' but does not specify their version numbers. |
| Experiment Setup | Yes | Table 3: Hyperparameter details for TDM and TSRL |
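The Dataset Splits row describes building reduced-size datasets by randomly sampling a fraction of whole trajectories from the full D4RL data. A minimal sketch of that idea is shown below; the helper name and toy data are assumptions for illustration, not the authors' code, which operates on the actual D4RL transition format.

```python
import random

def subsample_trajectories(trajectories, fraction, seed=0):
    """Randomly keep a given fraction of whole trajectories.

    Sampling at the trajectory level (rather than per transition)
    preserves temporal structure within each kept trajectory.
    `trajectories` is a list of transition lists; returns the kept
    transitions flattened back into a single list.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    n_keep = max(1, round(fraction * len(trajectories)))
    kept = rng.sample(trajectories, n_keep)
    return [t for traj in kept for t in traj]

# Toy example: 100 trajectories of 10 transitions each (1000 total).
data = [[(i, j) for j in range(10)] for i in range(100)]
small = subsample_trajectories(data, fraction=0.01)
print(len(small))  # 1% of trajectories -> 1 trajectory -> 10 transitions
```

With a real D4RL dataset one would first split the flat transition arrays into trajectories at terminal/timeout boundaries before applying such a helper.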