Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Variance Control for Distributional Reinforcement Learning
Authors: Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance. |
| Researcher Affiliation | Academia | Qi Kuang\*¹, Zhoufan Zhu\*¹, Liwen Zhang¹, Fan Zhou¹. ¹School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China. Correspondence to: Fan Zhou <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 QEMRL update algorithm |
| Open Source Code | Yes | Code is available at https://github.com/Kuangqi927/QEM |
| Open Datasets | Yes | We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance. Frozen Lake (Brockman et al., 2016) is a classic benchmark problem... |
| Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages, absolute counts, or specific predefined splits with citations) were found. |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or solver names with version numbers) were provided. |
| Experiment Setup | Yes | The parameter settings used for tabular control are presented in Table 1. Our hyperparameter settings (Table 2) are aligned with Dabney et al. (2018b) for a fair comparison. Hyperparameters and environment-specific parameters are listed in Table 3. |