Variance Control for Distributional Reinforcement Learning

Authors: Qi Kuang, Zhoufan Zhu, Liwen Zhang, Fan Zhou

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We extensively evaluate our QEMRL algorithm on a variety of Atari and MuJoCo benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance."
Researcher Affiliation | Academia | "Qi Kuang*, Zhoufan Zhu*, Liwen Zhang, Fan Zhou. School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China. Correspondence to: Fan Zhou <zhoufan@mail.shufe.edu.cn>."
Pseudocode | Yes | "Algorithm 1: QEMRL update algorithm" (a baseline loss sketch follows the table)
Open Source Code | Yes | "Code is available at https://github.com/Kuangqi927/QEM"
Open Datasets | Yes | "We extensively evaluate our QEMRL algorithm on a variety of Atari and MuJoCo benchmark tasks..."; "Frozen Lake (Brockman et al., 2016) is a classic benchmark problem..." (an environment sketch follows the table)
Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages, absolute counts, or specific predefined splits with citations) were found.
Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments were mentioned.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library or solver names pinned to versions) were provided.
Experiment Setup | Yes | "The parameter settings used for tabular control are presented in Table 1. Our hyperparameter settings (Table 2) are aligned with Dabney et al. (2018b) for a fair comparison. Hyperparameters and environment-specific parameters are listed in Table 3." (a configuration skeleton follows the table)
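The Pseudocode row refers to Algorithm 1 of the paper, which is the authoritative statement of the QEMRL update. For orientation only, the sketch below implements the quantile-regression TD loss of the baseline the paper aligns its hyperparameters with (Dabney et al., 2018b); it is not the paper's Algorithm 1, and the function name, shapes, and defaults are illustrative assumptions.

```python
import numpy as np

def quantile_huber_loss(theta, target, kappa=1.0):
    """Quantile-regression TD loss in the style of Dabney et al. (2018b).

    A generic baseline sketch, NOT the paper's Algorithm 1 (QEMRL).
    theta  : (N,) current quantile estimates for one (state, action) pair
    target : (N,) target quantiles, e.g. r + gamma * theta'(x', a*)
    kappa  : Huber threshold, assumed > 0
    """
    n = theta.shape[0]
    tau_hat = (np.arange(n) + 0.5) / n       # quantile midpoints (2i+1)/(2N)
    u = target[None, :] - theta[:, None]     # pairwise TD errors u[i, j]
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # Asymmetric quantile weighting |tau_i - 1{u < 0}|
    weight = np.abs(tau_hat[:, None] - (u < 0.0).astype(float))
    # Average over target samples j, sum over quantile levels i.
    return (weight * huber / kappa).mean(axis=1).sum()
```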
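All three benchmark families cited in the Open Datasets row (Frozen Lake, Atari, MuJoCo) are distributed through OpenAI Gym (Brockman et al., 2016). Below is a minimal sketch of how such environments are typically instantiated; the environment IDs are standard Gym registry names rather than identifiers quoted from the paper, and the classic pre-0.26 Gym API is assumed (newer releases return an (obs, info) pair from reset and a five-tuple from step).

```python
import gym  # Brockman et al., 2016

# Standard Gym registry IDs for the benchmark families named in the paper;
# these IDs are assumptions, not identifiers quoted from the paper.
env = gym.make("FrozenLake-v1")            # tabular control ("FrozenLake-v0" in older Gym)
# env = gym.make("BreakoutNoFrameskip-v4") # an Atari task (requires the atari extras)
# env = gym.make("HalfCheetah-v3")         # a MuJoCo task (requires the MuJoCo bindings)

obs = env.reset()                          # classic (pre-0.26) Gym API
done = False
while not done:
    action = env.action_space.sample()     # random placeholder policy
    obs, reward, done, info = env.step(action)
env.close()
```

Since the Software Dependencies row found no pinned versions, the exact Gym API the authors targeted cannot be confirmed from the paper.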
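The Experiment Setup row points to Tables 1-3 of the paper for the actual values. Since those tables are not reproduced on this page, the skeleton below only illustrates the shape such a configuration typically takes for a quantile-based agent; every field name and value is a placeholder, not a setting from the paper.

```python
# Hypothetical configuration skeleton. Field names and values are illustrative
# placeholders for a quantile-based agent, NOT settings from Tables 1-3.
config = {
    "num_quantiles": 200,            # N, number of quantile estimates
    "discount": 0.99,                # gamma
    "learning_rate": 5e-5,           # optimizer step size
    "huber_kappa": 1.0,              # kappa in the quantile Huber loss
    "batch_size": 32,
    "target_update_period": 10_000,  # steps between target-network syncs
}
```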