An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay

Authors: Scott Fujimoto, David Meger, Doina Precup

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate both LAP and PAL on the suite of MuJoCo environments [3] and a set of Atari games [4]. Across both domains, we find both of our methods outperform the vanilla algorithms they modify. In the MuJoCo domain, we find significant gains over the state-of-the-art in the hardest task, Humanoid. All code is open-sourced (https://github.com/sfujim/LAP-PAL)."
Researcher Affiliation | Academia | Scott Fujimoto, David Meger, Doina Precup; Mila, McGill University; scott.fujimoto@mail.mcgill.ca
Pseudocode | No | The paper describes the algorithms mathematically but does not include structured pseudocode or algorithm blocks (an illustrative sketch of the underlying prioritized-replay idea follows this table).
Open Source Code | Yes | "All code is open-sourced (https://github.com/sfujim/LAP-PAL)."
Open Datasets | Yes | "We evaluate the benefits of LAP and PAL on the standard suite of MuJoCo [3] continuous control tasks as well as a subset of Atari games, both interfaced through OpenAI gym [41]."
Dataset Splits | No | The paper does not provide explicit details about training, validation, or test dataset splits, percentages, or sample counts.
Hardware Specification | No | The paper does not specify the hardware used to run its experiments, such as GPU or CPU models.
Software Dependencies | No | The paper mentions software such as OpenAI Gym and specific algorithms, but does not provide version numbers for any software dependencies.
Experiment Setup | No | "A complete list of hyper-parameters and experimental details are provided in the supplementary material."
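As noted in the Pseudocode row, the paper specifies LAP and PAL mathematically rather than in pseudocode. The sketch below is a minimal, hedged illustration of non-uniform sampling from a replay buffer with a LAP-style priority rule (the absolute TD error raised to a power alpha and clipped below at 1), paraphrased from the paper's description; the class and parameter names (PrioritizedBuffer, alpha) are hypothetical, and the authors' actual implementation is in the linked repository (https://github.com/sfujim/LAP-PAL).

```python
import numpy as np

class PrioritizedBuffer:
    """Illustrative prioritized replay buffer (not the authors' code)."""

    def __init__(self, capacity, alpha=0.4):
        self.capacity = capacity
        self.alpha = alpha          # priority exponent (hypothetical default)
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # Give new transitions the current maximum priority (at least 1)
        # so they are sampled at least once before being re-prioritized.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Non-uniform sampling: probability proportional to stored priority.
        p = self.priorities[:len(self.data)]
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), size=batch_size, p=probs)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, idx, td_errors):
        # LAP-style priority, paraphrased from the paper: |TD error|^alpha,
        # clipped below at 1 so low-error transitions keep a minimum
        # probability of being sampled. Treat this as an assumption, not
        # the reference implementation.
        self.priorities[idx] = np.maximum(np.abs(td_errors) ** self.alpha, 1.0)
```

A training loop would draw a batch with sample(), compute TD errors for those transitions, and feed the errors back through update_priorities(); the paper's central result relates this kind of non-uniform sampling to an equivalent loss function under uniform sampling.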