Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
Authors: Qiang He, Tianyi Zhou, Meng Fang, Setareh Maghsudi
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first validate the effectiveness of automatic control of rank on illustrative experiments. Then, we scale up BEER to complex continuous control tasks by combining it with the deterministic policy gradient method. Among 12 challenging DeepMind control tasks, BEER outperforms the baselines by a large margin. |
| Researcher Affiliation | Academia | ¹Ruhr University Bochum, ²University of Maryland, College Park, ³University of Liverpool |
| Pseudocode | Yes | We summarize BEER based on DPG in Algorithm 1. |
| Open Source Code | Yes | Our code is available at https://github.com/sweetice/BEER-ICLR2024. |
| Open Datasets | Yes | DMControl (Tunyasuvunakool et al., 2020) serves as a standard benchmark suite for evaluating the capabilities of DRL agents in complex, continuous control tasks. |
| Dataset Splits | No | The paper describes evaluation protocols in terms of episodes and timesteps within reinforcement learning environments, but it does not specify explicit training/validation/test dataset splits in terms of percentages or sample counts for static datasets. |
| Hardware Specification | Yes | This server is equipped with 8 GeForce 2080 Ti GPUs and has 70 CPU logical cores... Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz and NVIDIA GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions using 'PyTorch, Numpy, Gym, Random, and CUDA' for deterministic controls but does not provide specific version numbers for these libraries. |
| Experiment Setup | Yes | Comprehensive details of the experimental setup appear in Appendix C. The paper includes Tables 2, 3, and 4, which list detailed hyperparameter settings for the Lunar Lander, Grid World, and DMControl experiments, respectively. |
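
The Software Dependencies row notes that the paper applies deterministic controls to PyTorch, Numpy, Gym, Random, and CUDA. As a rough illustration of what such controls typically look like, the sketch below seeds each library that happens to be installed. The helper name `seed_everything` and its return value are hypothetical and are not taken from the paper or its repository; Gym is left out of the code because its seeding call differs between versions, and the paper does not state which version was used.

```python
import os
import random


def seed_everything(seed: int) -> dict:
    """Seed the available RNGs and report which libraries were seeded.

    Hypothetical helper for illustration only; not the authors' code.
    """
    seeded = {"random": True, "numpy": False, "torch": False, "cuda": False}

    # Python's built-in generator.
    random.seed(seed)
    # Only affects Python subprocesses started after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy generator
        seeded["numpy"] = True
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        seeded["torch"] = True
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
            # Trade speed for repeatable cuDNN kernel selection.
            torch.backends.cudnn.deterministic = True
            torch.backends.cudnn.benchmark = False
            seeded["cuda"] = True
    except ImportError:
        pass

    # Gym environments need their own seed, passed through the
    # environment API of whichever Gym version is installed.
    return seeded


if __name__ == "__main__":
    print(seed_everything(0))
```

Because the paper gives no version numbers, identical seeds may still produce different trajectories across library versions or GPU models, so recording the installed versions alongside the seed is a reasonable complement to this kind of helper.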