Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls
Authors: Jeongho Kim, Jaeuk Shin, Insoon Yang
JMLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate the performance of our method through benchmark tasks and high-dimensional linear-quadratic problems. In this section, we present the empirical performance of our method on benchmark tasks as well as high-dimensional LQ problems. |
| Researcher Affiliation | Academia | Jeongho Kim EMAIL Institute of New Media and Communications Seoul National University Seoul 08826, South Korea Jaeuk Shin EMAIL Department of Electrical and Computer Engineering Automation and Systems Research Institute Seoul National University Seoul 08826, South Korea Insoon Yang EMAIL Department of Electrical and Computer Engineering Automation and Systems Research Institute Seoul National University Seoul 08826, South Korea |
| Pseudocode | Yes | Algorithm 1: Hamilton-Jacobi DQN |
| Open Source Code | Yes | The source code of our HJ DQN implementation is available online: https://github.com/HJDQN/HJQ |
| Open Datasets | Yes | We evaluate our algorithm on OpenAI benchmark tasks and high-dimensional linear-quadratic (LQ) control problems. We consider continuous control benchmark tasks in OpenAI Gym (Brockman et al., 2016) simulated by the MuJoCo engine (Todorov et al., 2012). |
| Dataset Splits | No | The paper evaluates performance on benchmark tasks like Open AI gym and LQ problems, which generate data through interaction rather than using pre-split datasets. It mentions running experiments for "1 million steps" and "5,000 episodes" for evaluation, but does not provide specific training/test/validation splits of a fixed dataset. |
| Hardware Specification | Yes | All the simulations in Section 5 were conducted using Python 3.7.4 on a PC with an Intel Core i9-9900X @ 3.50GHz, an NVIDIA GeForce RTX 2080 Ti, and 64GB RAM. |
| Software Dependencies | No | Appendix D states: 'All the simulations in Section 5 were conducted using Python 3.7.4 on a PC with Intel Core i9-9900X @ 3.50GHz, NVIDIA GeForce RTX 2080 Ti, and 64GB RAM.' It specifies the Python version but does not provide version numbers for any deep learning frameworks (e.g., PyTorch, TensorFlow) or other key libraries used in the implementation. |
| Experiment Setup | Yes | Table 1: Hyperparameters for HJ DQN. ... Table 2: Hyperparameters for DDPG. These tables list specific values for the learning rate, Lipschitz constant, sampling interval, discount factor, replay buffer size, target smoothing coefficient, noise coefficient, number of hidden layers and units, samples per minibatch, and nonlinearity. |