Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation

Authors: Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We first validate the effectiveness of automatic control of rank on illustrative experiments. Then, we scale up BEER to complex continuous control tasks by combining it with the deterministic policy gradient method. Among 12 challenging DeepMind Control tasks, BEER outperforms the baselines by a large margin." |
| Researcher Affiliation | Academia | Ruhr University Bochum; University of Maryland, College Park; University of Liverpool |
| Pseudocode | Yes | "We summarize BEER based on DPG in Algorithm 1." (a hedged sketch of such an update follows this table) |
| Open Source Code | Yes | "Our code is available at https://github.com/sweetice/BEER-ICLR2024." |
| Open Datasets | Yes | "DMControl (Tunyasuvunakool et al., 2020) serves as a standard benchmark suite for evaluating the capabilities of DRL agents in complex, continuous control tasks." (a loading example follows this table) |
| Dataset Splits | No | The paper describes its evaluation protocol in terms of episodes and timesteps within reinforcement learning environments; it does not specify explicit training/validation/test splits as percentages or sample counts, as one would for static datasets. |
| Hardware Specification | Yes | "This server is equipped with 8 GeForce 2080 Ti GPUs and has 70 CPU logical cores... Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz and NVIDIA GeForce RTX 2080 Ti GPUs." |
| Software Dependencies | No | The paper mentions using 'PyTorch, NumPy, Gym, Random, and CUDA' for deterministic controls but does not provide specific version numbers for these libraries. (a seeding snippet follows this table) |
| Experiment Setup | Yes | Comprehensive details of the experimental setup appear in Appendix C; Tables 2, 3, and 4 list the detailed hyperparameter settings for the Lunar Lander, Grid World, and DMControl experiments, respectively. |
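
On the pseudocode row: the paper's Algorithm 1 (BEER on top of DPG) is not reproduced here. As rough orientation only, the following is a minimal sketch of a DPG-style update with a representation-rank penalty added to the critic loss. The nuclear-norm penalty, the `beer_coeff` weight, and the `critic(s, a, return_feature=True)` signature are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def beer_style_update(critic, actor, target_critic, target_actor,
                      batch, critic_opt, actor_opt,
                      gamma=0.99, beer_coeff=1e-3):
    """One DPG-style update with a hypothetical rank penalty on the critic's
    penultimate features. A sketch only, not the paper's Algorithm 1."""
    s, a, r, s_next, done = batch

    # Standard TD target computed with target networks, as in DPG/DDPG.
    with torch.no_grad():
        target_q = r + gamma * (1.0 - done) * target_critic(s_next, target_actor(s_next))

    # Assumed signature: critic(s, a, return_feature=True) also returns the
    # penultimate representation phi of shape (batch, feature_dim).
    q, phi = critic(s, a, return_feature=True)
    td_loss = F.mse_loss(q, target_q)

    # Illustrative stand-in for BEER's regularizer: the nuclear norm (sum of
    # singular values) of the feature matrix, which discourages the
    # representation from using more rank than necessary.
    rank_penalty = torch.linalg.matrix_norm(phi, ord="nuc")

    critic_loss = td_loss + beer_coeff * rank_penalty
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Deterministic policy gradient: ascend Q(s, actor(s)).
    # (Assumes critic(s, a) without return_feature returns Q-values only.)
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
```

Note that in the actual method, per the title and the quoted abstract, the regularization strength is controlled automatically via a Bellman-equation-derived constraint rather than fixed as a constant `beer_coeff`.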
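
On the benchmark row: DMControl tasks are loaded through the `dm_control` package. The snippet below shows the standard interaction loop with a random policy; the `cheetah`/`run` domain and task names are example choices, not necessarily among the paper's 12 tasks.

```python
import numpy as np
from dm_control import suite  # pip install dm_control

# Example task; the domain/task names here are illustrative.
env = suite.load(domain_name="cheetah", task_name="run")
action_spec = env.action_spec()

time_step = env.reset()
episode_return = 0.0
while not time_step.last():
    # Uniform random actions within the bounded action spec.
    action = np.random.uniform(action_spec.minimum, action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
    episode_return += time_step.reward
print(f"Random-policy episode return: {episode_return:.1f}")
```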
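
On the software-dependencies row: the paper names PyTorch, NumPy, Gym, Random, and CUDA for deterministic controls but pins no versions. A common seeding pattern consistent with those libraries (not the authors' exact code) looks like this:

```python
import random

import numpy as np
import torch

def set_global_seed(seed: int) -> None:
    """Seed Python, NumPy, and PyTorch (CPU and all CUDA devices)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Force deterministic cuDNN kernels at some cost in speed.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_global_seed(0)
```

Gym environments additionally need their own seed: `env.seed(seed)` in older Gym versions, or `env.reset(seed=seed)` in newer ones.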