reproducibilityindex.ai

SimSR: Simple Distance-Based State Representations for Deep Reinforcement Learning

Authors: Hongyu Zang, Xin Li, Mingzhong Wang
Pages: 8997-9005

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We experimented to investigate the following questions: (1) In comparison with state-of-the-art algorithms, does SimSR have a better performance in terms of sample efficiency? (2) Can SimSR learn robust state representation? (3) How is the generalization performance of the learned representation? Accordingly, we first evaluated our method in several standard control tasks from the DeepMind control (DMC) suite (Tassa et al. 2018). We then evaluated the robustness of our method to test if it can handle more realistic and complicated scenarios, where we replaced the background of the environment with natural video as distractors. Finally, we tested the generalization performance of the state representation in unseen tasks.
Researcher Affiliation	Academia	Hongyu Zang (1), Xin Li* (1), Mingzhong Wang (2); (1) Beijing Institute of Technology; (2) University of the Sunshine Coast; {zanghyu,xinli}@bit.edu.cn, mwang@usc.edu.au
Pseudocode	Yes	Algorithm 1: SimSR algorithm
Open Source Code	Yes	Our code is available at https://github.com/bit1029public/SimSR
Open Datasets	Yes	Accordingly, we first evaluated our method in several standard control tasks from the DeepMind control (DMC) suite (Tassa et al. 2018). We then evaluated the robustness of our method to test if it can handle more realistic and complicated scenarios, where we replaced the background of the environment with natural video as distractors. Finally, we tested the generalization performance of the state representation in unseen tasks.
Dataset Splits	No	The paper does not explicitly provide specific train/validation/test dataset splits with percentages or sample counts.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies	No	The paper does not explicitly list software dependencies with specific version numbers.
Experiment Setup	Yes	We keep all hyper-parameters of the algorithm fixed throughout experiments except the action repeat which follows the convention to ensure a fair comparison. The settings of all hyper-parameters and architectures are also provided in Appendix.
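
The Experiment Setup entry mentions "action repeat" as the one hyper-parameter that varies between tasks. As a rough illustration of what that setting usually means in control benchmarks, the sketch below applies each agent action several times and sums the rewards. It is a generic illustration, not the authors' code: `CountingEnv`, `ActionRepeat`, and the value `repeat=4` are hypothetical names and numbers chosen for the example, and the per-task values used in the paper are in its Appendix and are not reproduced here.

```python
class CountingEnv:
    """Hypothetical toy environment: reward 1.0 per step, ends after `horizon` steps."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        # The observation is simply the step counter in this toy example.
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class ActionRepeat:
    """Apply each agent action `repeat` times and return the summed reward."""

    def __init__(self, env, repeat):
        if repeat < 1:
            raise ValueError("repeat must be at least 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                # Stop early so no steps are taken past the end of the episode.
                break
        return obs, total_reward, done


env = ActionRepeat(CountingEnv(horizon=10), repeat=4)
env.reset()
first = env.step(0)
second = env.step(0)
third = env.step(0)  # only two underlying steps remain before the episode ends
```

With a repeat of 4, the agent makes one decision per four simulator steps, which is why comparisons across methods normally fix this value per task.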