SimSR: Simple Distance-Based State Representations for Deep Reinforcement Learning
Authors: Hongyu Zang, Xin Li, Mingzhong Wang
Pages: 8997-9005
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimented to investigate the following questions: (1) In comparison with state-of-the-art algorithms, does SimSR have a better performance in terms of sample efficiency? (2) Can SimSR learn robust state representation? (3) How is the generalization performance of the learned representation? Accordingly, we first evaluated our method in several standard control tasks from the DeepMind Control (DMC) suite (Tassa et al. 2018). We then evaluated the robustness of our method to test if it can handle more realistic and complicated scenarios, where we replaced the background of the environment with natural video as distractors. Finally, we tested the generalization performance of the state representation in unseen tasks. |
| Researcher Affiliation | Academia | Hongyu Zang¹, Xin Li*¹, Mingzhong Wang²; ¹Beijing Institute of Technology; ²University of the Sunshine Coast; {zanghyu,xinli}@bit.edu.cn, mwang@usc.edu.au |
| Pseudocode | Yes | Algorithm 1: SimSR algorithm |
| Open Source Code | Yes | Our code is available at https://github.com/bit1029public/SimSR |
| Open Datasets | Yes | Accordingly, we first evaluated our method in several standard control tasks from the DeepMind Control (DMC) suite (Tassa et al. 2018). We then evaluated the robustness of our method to test if it can handle more realistic and complicated scenarios, where we replaced the background of the environment with natural video as distractors. Finally, we tested the generalization performance of the state representation in unseen tasks. |
| Dataset Splits | No | The paper does not explicitly provide specific train/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not explicitly list software dependencies with specific version numbers. |
| Experiment Setup | Yes | We keep all hyper-parameters of the algorithm fixed throughout experiments except the action repeat which follows the convention to ensure a fair comparison. The settings of all hyper-parameters and architectures are also provided in Appendix. |
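
The Experiment Setup row names action repeat as the only hyper-parameter that varies across tasks. For readers unfamiliar with the term, the following is a minimal sketch of what an action-repeat wrapper typically does: each action chosen by the agent is applied for several consecutive environment steps and the rewards are summed. This is not taken from the SimSR code base; `CountingEnv`, `ActionRepeat`, and `run_episode` are hypothetical names, and the toy environment stands in for a real DMC task.

```python
from dataclasses import dataclass


@dataclass
class CountingEnv:
    """Hypothetical toy environment: reward 1.0 per step, ends after `horizon` steps."""
    horizon: int = 10
    t: int = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        # The action is ignored; this environment only counts steps.
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done


class ActionRepeat:
    """Apply each agent action `repeat` times and sum the rewards (illustrative)."""

    def __init__(self, env, repeat):
        if repeat < 1:
            raise ValueError("repeat must be >= 1")
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:  # stop early if the episode ends mid-repeat
                break
        return obs, total_reward, done


def run_episode(env):
    """Return (number of agent decisions, episode return) for a fixed action."""
    env.reset()
    decisions, episode_return, done = 0, 0.0, False
    while not done:
        _, reward, done = env.step(0)
        decisions += 1
        episode_return += reward
    return decisions, episode_return


if __name__ == "__main__":
    print(run_episode(ActionRepeat(CountingEnv(horizon=10), repeat=4)))
```

With a larger repeat value the agent makes fewer decisions per episode while the environment still advances the same number of steps, which is why the value is usually set per task and kept consistent across compared methods.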