reproducibilityindex.ai

Posterior Sampling for Deep Reinforcement Learning

Authors: Remo Sasso, Michelangelo Conserva, Paulo Rauber

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the Atari benchmark show that PSDRL significantly outperforms previous state-of-the-art attempts at scaling up posterior sampling while being competitive with a state-of-the-art (model-based) reinforcement learning method, both in sample efficiency and computational efficiency.
Researcher Affiliation	Academia	School of Electronic Engineering and Computer Science, Queen Mary University of London, United Kingdom. Correspondence to: Remo Sasso <r.sasso@qmul.ac.uk>.
Pseudocode	Yes	Algorithm 1 summarizes PSDRL (forward models are sampled every m time steps instead of episodically).
Open Source Code	Yes	The source code for replicating all experiments is available as supplementary material. ... The implementation for PSDRL can be found at https://github.com/remosasso/PSDRL.
Open Datasets	Yes	We provide an experimental comparison between PSDRL and other algorithms on 55 Atari 2600 games that are commonly used in the literature (Mnih et al., 2015).
Dataset Splits	No	The paper does not explicitly provide specific train/validation/test dataset splits with percentages, sample counts, or references to predefined splits in the typical supervised learning sense. It describes evaluation episodes and environment steps.
Hardware Specification	Yes	We make use of an NVIDIA A100 GPU for training.
Software Dependencies	No	The paper mentions training with the 'Adam optimizer (Kingma & Ba, 2015)' but does not provide specific version numbers for software libraries, programming languages, or other tools used (e.g., PyTorch 1.9, Python 3.8).
Experiment Setup	Yes	Table 3 presents the hyperparameters for PSDRL, the search sets used for grid search, and the resulting values used for the experiments.