Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
Authors: Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on a linear quadratic regulator problem show that the two proposed Q-learning algorithms outperform the vanilla Q-learning with SGD updates. The two algorithms also exhibit significantly better performance than the DQN learning method over a batch of Atari 2600 games. |
| Researcher Affiliation | Academia | Bowen Weng¹, Huaqing Xiong¹, Yingbin Liang¹ and Wei Zhang². ¹Electrical and Computer Engineering, The Ohio State University, Columbus, OH, USA. ²Mechanical and Energy Engineering, Southern University of Science and Technology, China. |
| Pseudocode | Yes | Algorithm 1 Q-AMSGrad ... Algorithm 2 Q-AMSGradR |
| Open Source Code | No | Detailed proofs and more experimental results will be available in the extended version of this paper on arXiv.org after the official publication of IJCAI Proceedings. |
| Open Datasets | Yes | We then use the Atari 2600 games [Brockman et al., 2016], a classic benchmark for DQN evaluations, to demonstrate the effectiveness of the Q-learning algorithms for complicated tasks. |
| Dataset Splits | No | The paper mentions using Atari 2600 games and LQR problems but does not provide specific details on how the datasets were split into training, validation, and test sets (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details on the hardware used, such as GPU models, CPU specifications, or cloud instance types. |
| Software Dependencies | No | The paper mentions using 'OpenAI Gym' for Atari games and 'Adam' as an optimizer, but does not provide specific version numbers for these or other software components. |
| Experiment Setup | Yes | The hyperparameters of the learning settings are also consistent and further details are shown in Table 1. ... Selections of the hyperparameters are listed in Table 2. |
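The pseudocode row above refers to Q-AMSGrad, which replaces the SGD step in vanilla Q-learning with an AMSGrad update. As a rough illustration of the optimizer involved (not the paper's actual pseudocode; the function name, state layout, and default hyperparameters here are our own assumptions), a single AMSGrad step can be sketched as:

```python
import numpy as np

def amsgrad_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.99, eps=1e-8):
    """One AMSGrad update (Reddi et al., 2018) on parameters `theta`.

    This is a hypothetical sketch of the optimizer that Q-AMSGrad
    substitutes for SGD in the Q-learning update; `state` holds the
    first moment m, second moment v, and the running max v_hat.
    """
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    v_hat = np.maximum(v_hat, v)              # AMSGrad: v_hat is non-decreasing
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)
```

The paper's second algorithm, Q-AMSGradR, additionally restarts the momentum term periodically; in this sketch that would amount to resetting `m` to zero every fixed number of steps.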