Regularized Q-Learning
Authors: Han-Dong Lim, Donghwan Lee
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we briefly present the experimental results under the well-known environments of Tsitsiklis and Van Roy [1996] and Baird [1995], where Q-learning with linear function approximation diverges. As shown in Figure 2b, our algorithm converges faster than the other algorithms. (See the divergence sketch after the table.) |
| Researcher Affiliation | Academia | Han-Dong Lim Electrical Engineering, KAIST limaries30@kaist.ac.kr Donghwan Lee Electrical Engineering, KAIST donghwan@kaist.ac.kr |
| Pseudocode | Yes | The pseudo-code is given in Appendix A.16. |
| Open Source Code | Yes | We have attached the code in the supplementary files. |
| Open Datasets | Yes | In this section, we briefly present the experimental results under the well-known environments of Tsitsiklis and Van Roy [1996] and Baird [1995], where Q-learning with linear function approximation diverges. |
| Dataset Splits | No | No explicit validation set or train/validation/test splits are mentioned in the paper for reproducing data partitioning. |
| Hardware Specification | No | Our experiments can simply run on a normal computer because we do not require any heavy computation, including using a GPU. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with specific versions) are mentioned in the paper. |
| Experiment Setup | Yes | The learning rates for Greedy GQ (GGQ) and Coupled Q-Learning (CQL), which have two learning rates each, are set to 0.05 and 0.25, respectively... For Reg Q, we set the learning rate to 0.25 and the weight η to two. (See the update sketch after the table.) |
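
The divergence environments cited in the table are classical counterexamples for Q-learning with linear function approximation. Below is a minimal, self-contained sketch of the two-state "θ → 2θ" example in the style of Tsitsiklis and Van Roy [1996]; the discount factor, learning rate, and synchronous-update scheme are illustrative assumptions, not the paper's exact experimental configuration.

```python
import numpy as np

# Two-state "theta -> 2*theta" counterexample in the style of
# Tsitsiklis and Van Roy [1996]. One action per state and zero rewards,
# with features phi(1) = 1 and phi(2) = 2, so the approximate values are
# theta and 2*theta. Both states transition to state 2. With synchronous
# updates and gamma > 5/6, the expected semi-gradient update multiplies
# theta by 1 + alpha * (6*gamma - 5) / 2 > 1 every step, so it diverges.

gamma = 0.99   # assumed discount factor (divergence needs gamma > 5/6)
alpha = 0.1    # assumed learning rate, for illustration only
theta = 1.0    # single weight of the linear approximator

phi = np.array([1.0, 2.0])     # features of states 1 and 2
next_state = np.array([1, 1])  # both states move to state 2 (index 1)

for _ in range(200):
    # TD errors for both states under zero reward.
    delta = gamma * phi[next_state] * theta - phi * theta
    # Synchronous semi-gradient step, averaged over the two states.
    theta += alpha * np.mean(delta * phi)

print(f"theta after 200 steps: {theta:.3e}")  # grows unboundedly; true values are 0
```

Even though every reward is zero, so the true value of both states is zero, the weight grows geometrically, which is exactly the failure mode these environments are used to exhibit.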
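The Experiment Setup row reports a learning rate of 0.25 and a weight η = 2 for Reg Q. The sketch below shows one plausible form of a regularized linear Q-learning update, an ℓ2-style penalty weighted by η added to the standard semi-gradient step; the exact update rule is the one given in the paper's Appendix A.16 pseudocode, and the function signature and discount factor here are assumptions for illustration.

```python
import numpy as np

def reg_q_step(theta, phi_sa, reward, phi_next_all,
               alpha=0.25, gamma=0.99, eta=2.0):
    """One regularized linear Q-learning step (hypothetical form).

    theta        : weight vector of the linear approximator
    phi_sa       : feature vector of the visited (state, action) pair
    phi_next_all : rows are features of (next_state, a) for every action a
    alpha, eta   : 0.25 and 2, the values reported in the Experiment Setup
                   row; gamma is an assumed discount factor.
    """
    td_target = reward + gamma * np.max(phi_next_all @ theta)
    td_error = td_target - phi_sa @ theta
    # Standard semi-gradient step plus an l2-style penalty weighted by eta;
    # the penalty is what keeps the iterates bounded on the counterexamples.
    return theta + alpha * (td_error * phi_sa - eta * theta)
```

A call such as `theta = reg_q_step(theta, phi_sa, r, phi_next_all)` applies one update; note that such a penalty generally biases the fixed point away from the unregularized solution, which is the usual price paid for stability.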