Distributional Reinforcement Learning with Monotonic Splines

Authors: Yudong Luo, Guiliang Liu, Haonan Duan, Oliver Schulte, Pascal Poupart

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments in stochastic environments show that a dense estimation for quantile functions enhances distributional RL in terms of faster empirical convergence and higher rewards in most cases.
Researcher Affiliation | Academia | Yudong Luo (1,4), Guiliang Liu (1,4), Haonan Duan (2,4), Oliver Schulte (3), Pascal Poupart (1,4); 1 University of Waterloo, 2 University of Toronto, 3 Simon Fraser University, 4 Vector Institute
Pseudocode | Yes | Algorithm 1: DDPG with QR-based distributional critic (apart from FQF) ... Algorithm 6: SAC with QR-based distributional critic (MM)
Open Source Code | Yes | The code for the main experiments is released in the supplementary material.
Open Datasets | Yes | "Hence, in this work, we modify several robotics environments by adding stochasticity, including one discrete environment from OpenAI Gym (Brockman et al., 2016) and nine continuous environments from PyBullet Gym (Ellenberger, 2018-2019)."
Dataset Splits | No | The paper describes training frames/episodes and testing procedures, but it does not specify explicit train/validation/test splits with percentages or sample counts, as a classification or regression paper would for a static dataset.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for the experiments.
Software Dependencies | No | The paper mentions software components such as DDPG, SAC, and RMSProp, but does not provide version numbers for any software or libraries.
Experiment Setup | Yes | Table 1: Common hyperparameters for SPL-DQN, NC-QR-DQN, NDQFN, and QR-DQN. Table 2: Common hyperparameters across SPL-DQN, QR-DQN, IQN, FQF, NC-QR-DQN, MM-DQN, and NDQFN. Table 3: Noise settings for different environments in PyBullet Gym. Table 4: Hyperparameters for DDPG and DDPG-based methods. Table 5: Hyperparameters for SAC and SAC-based methods.
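The QR-based critics listed above (QR-DQN and the DDPG/SAC variants) are all trained on the quantile Huber loss of Dabney et al. (2018). As a rough illustration of that objective, here is a minimal NumPy sketch; the function name and array layout are assumptions for this example, not the authors' released code.

```python
import numpy as np

def quantile_huber_loss(td_errors, taus, kappa=1.0):
    """Quantile Huber loss (QR-DQN style) -- illustrative sketch, not the paper's code.

    td_errors: (N, M) pairwise TD errors, target sample j minus predicted quantile i
    taus: (N,) quantile fractions associated with the N predicted quantiles
    kappa: Huber threshold
    """
    abs_err = np.abs(td_errors)
    # Huber loss: quadratic within kappa, linear outside
    huber = np.where(abs_err <= kappa,
                     0.5 * td_errors ** 2,
                     kappa * (abs_err - 0.5 * kappa))
    # Asymmetric quantile weight: under-estimates weighted by tau,
    # over-estimates by (1 - tau)
    weight = np.abs(taus[:, None] - (td_errors < 0).astype(float))
    return float((weight * huber / kappa).mean())
```

At tau = 0.5 the loss is symmetric in the sign of the TD error; as tau approaches 1 it penalizes under-estimation more heavily, which is what makes the learned quantiles spread across the return distribution.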