reproducibilityindex.ai

Visual Transfer For Reinforcement Learning Via Wasserstein Domain Confusion

Authors: Josh Roy, George D. Konidaris

Pages: 9454-9462

AAAI 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental Results: We validate our novel Wasserstein Confusion loss term and WAPPO algorithm on 17 environments: Visual Cartpole and both the easy and hard versions of 16 OpenAI Procgen environments.
Researcher Affiliation	Academia	Josh Roy and George Konidaris Brown University joshnroy@gmail.com, gdk@cs.brown.edu
Pseudocode	No	The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code	No	The paper states: 'For further details, please see the appendix and the code [1] distributed with Cobbe et al. (2019a).' Footnote 1 points to https://github.com/openai/procgen. This is the code for the PPO baseline, not the authors' WAPPO implementation or their modifications to it.
Open Datasets	Yes	We validate our novel Wasserstein Confusion loss term and WAPPO algorithm on 17 environments: Visual Cartpole and both the easy and hard versions of 16 OpenAI Procgen environments.
Dataset Splits	No	The paper states: 'For each environment evaluated, the agent trains using WAPPO with full access to the source domain and a buffer of 5000 observations from the target domain.' However, it does not specify explicit training/validation/test dataset splits or cross-validation settings typically associated with model validation.
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies	No	The paper mentions using a 'PPO implementation' and 'Leaky ReLU activations' but does not provide specific version numbers for software dependencies like Python, deep learning frameworks, or libraries.
Experiment Setup	Yes	We utilize the PPO implementation and hyperparameters provided with (Cobbe et al. 2019a). We use these same hyperparameters for the other methods tested and do not perform any hyperparameter searches.
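
The Dataset Splits row quotes a setup in which the agent has full access to the source domain and a buffer of 5000 observations from the target domain. As a rough illustration only, such a fixed-size buffer might look like the sketch below. The class name `TargetObservationBuffer`, the fill-until-full policy, and uniform sampling without replacement are all assumptions made for this example; the paper excerpt does not say how the buffer is filled or sampled, and this is not the authors' code.

```python
import random

TARGET_BUFFER_SIZE = 5000  # buffer size quoted from the paper


class TargetObservationBuffer:
    """Fixed-size store of target-domain observations (hypothetical helper)."""

    def __init__(self, capacity=TARGET_BUFFER_SIZE, seed=0):
        self.capacity = capacity
        self.observations = []
        self.rng = random.Random(seed)

    def add(self, observation):
        """Store an observation; return False once the buffer is full."""
        if len(self.observations) >= self.capacity:
            return False
        self.observations.append(observation)
        return True

    def sample(self, batch_size):
        """Draw a minibatch without replacement (sampling scheme is an assumption)."""
        return self.rng.sample(self.observations, batch_size)


# Fill the buffer with placeholder items standing in for image observations.
buffer = TargetObservationBuffer()
for step in range(6000):
    buffer.add(("target_obs", step))

# A minibatch like this could be passed to a domain confusion loss.
batch = buffer.sample(32)
```

Capping the buffer at a fixed size keeps the amount of target-domain data bounded, which matches the limited-access setting described in the quoted sentence.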