Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion
Authors: Taehyun Cho, Seungyub Han, Heesoo Lee, Kyungjae Lee, Jungwoo Lee
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically show that our method outperforms other existing distribution-based algorithms in various environments including Atari 55 games. |
| Researcher Affiliation | Collaboration | Taehyun Cho (1), Seungyub Han (1,3), Heesoo Lee (1), Kyungjae Lee (2), Jungwoo Lee (1) — (1) Seoul National University, (2) Chung-Ang University, (3) Hodoo AI Labs |
| Pseudocode | Yes | Algorithm 1 Perturbed QR-DQN (PQR) |
| Open Source Code | No | The paper references third-party codebases such as DQN Zoo and Dopamine for comparisons, but does not provide an explicit statement or link to open-source code for its own proposed method (PQR). |
| Open Datasets | Yes | Finally, we empirically show that our method outperforms other existing distribution-based algorithms in various environments including Atari 55 games. |
| Dataset Splits | No | The paper mentions using standard benchmark environments such as Atari and N-Chain but does not explicitly provide training, validation, or test split percentages or sample counts. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are provided in the paper. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Table 2: Table of hyperparameter setting |