Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline that has been validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Variance Control for Distributional Reinforcement Learning

Authors: Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou

ICML 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance." |
| Researcher Affiliation | Academia | "Qi Kuang * 1 Zhoufan Zhu * 1 Liwen Zhang 1 Fan Zhou 1 School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China. Correspondence to: Fan Zhou <EMAIL>." |
| Pseudocode | Yes | "Algorithm 1 QEMRL update algorithm" |
| Open Source Code | Yes | "Code is available at https://github.com/Kuangqi927/QEM" |
| Open Datasets | Yes | "We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks... Frozen Lake (Brockman et al., 2016) is a classic benchmark problem..." |
| Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages, absolute counts, or specific predefined splits with citations) were found. |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or solver names with version numbers) were provided. |
| Experiment Setup | Yes | "The parameter settings used for tabular control are presented in Table 1. Our hyperparameter settings (Table 2) are aligned with Dabney et al. (2018b) for a fair comparison. Hyperparameters and environment-specific parameters are listed in Table 3." |