Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

BeigeMaps: Behavioral Eigenmaps for Reinforcement Learning from Images

Authors: Sandesh Adhikary, Anqi Li, Byron Boots

ICML 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically demonstrate that when added as a drop-in modification, BeigeMaps improve the policy performance of prior behavioral distance based RL algorithms." and "We train and evaluate all algorithms on the DeepMind Control (DMC) suite (Tassa et al., 2018)".
Researcher Affiliation | Collaboration | Sandesh Adhikary (1), Anqi Li (1, 2), Byron Boots (1). (1) Computer Science and Engineering, University of Washington, Seattle, WA (USA); (2) NVIDIA. Work done while AL was affiliated with the University of Washington. Correspondence to: Sandesh Adhikary <EMAIL>.
Pseudocode | Yes | Algorithm 1: Behavioral Distance Representation Learning.
Open Source Code | No | The paper does not provide any explicit statement about releasing the source code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | "We train and evaluate all algorithms on the DeepMind Control (DMC) suite (Tassa et al., 2018), a set of continuous control tasks that has been used as a benchmark for all prior behavioral distance algorithms."
Dataset Splits | No | The paper mentions evaluating on "random evaluation seeds" and "random training seeds" but does not specify explicit percentages or counts for training, validation, or test splits. It also states that "All environments are truncated at 1000 steps and have dense rewards bounded between [0, 1], except for ball in cup catch which has sparse binary rewards," but this describes environment setup, not data splitting.
Hardware Specification | No | "All models were trained using the Hyak computing cluster at the University of Washington. Each model was trained on a single GPU, which was assigned by the cluster's scheduling system." No specific GPU model, CPU, or other hardware details are provided.
Software Dependencies | No | The paper mentions using a Docker container and components such as Soft Actor Critic (SAC) and the rliable library, but it does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "In Table 3, we list all model hyperparameter choices; the only difference from Zhang et al. (2022) is that we use a smaller replay buffer size (100,000 instead of 1M) due to computational constraints." Table 3 provides detailed hyperparameter values such as batch size 128, discount γ = 0.99, and various learning rates.
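The "Pseudocode" row refers to the paper's Algorithm 1 (Behavioral Distance Representation Learning), which is not reproduced in this report. As a rough, hypothetical illustration of what behavioral-distance representation learning generally involves (in the spirit of bisimulation-metric methods such as Zhang et al., 2022, not the authors' actual algorithm), one common ingredient is a loss that pushes latent distances toward a reward-difference-plus-discounted-next-state-distance target:

```python
import math

def behavioral_distance_loss(z_i, z_j, r_i, r_j, z_next_i, z_next_j, gamma=0.99):
    """Hypothetical sketch: penalize the gap between the latent distance
    d(z_i, z_j) and a behavioral target |r_i - r_j| + gamma * d(z'_i, z'_j).
    This is a generic bisimulation-style objective, NOT the paper's Algorithm 1.
    """
    d = math.dist(z_i, z_j)  # current latent distance
    target = abs(r_i - r_j) + gamma * math.dist(z_next_i, z_next_j)
    return (d - target) ** 2  # squared error against the behavioral target
```

In practice such a loss is minimized over minibatches of transition pairs sampled from a replay buffer, jointly with an actor-critic objective; all function and variable names above are illustrative.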
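The hyperparameters quoted from the paper's Table 3 can be collected into a small configuration sketch. Only the three values explicitly mentioned above are grounded in the report; the dictionary keys and structure are illustrative, not the authors' actual configuration:

```python
# Hypothetical config sketch; only the values below are quoted from Table 3
# of the paper (via the report above). Key names are assumptions.
config = {
    "batch_size": 128,              # Table 3: Batch Size 128
    "discount": 0.99,               # Table 3: Discount gamma 0.99
    "replay_buffer_size": 100_000,  # reduced from 1M due to compute constraints
}
```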