Self-Predictive Dynamics for Generalization of Vision-based Reinforcement Learning

Authors: Kyungsoo Kim, Jeongsoo Ha, Yusung Kim

IJCAI 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In a set of MuJoCo visual control tasks and an autonomous driving task (CARLA), SPD outperforms previous studies in complex observations, and significantly improves the generalization performance for unseen observations. Our code is available at https://github.com/unigary/SPD." |
| Researcher Affiliation | Academia | "Kyungsoo Kim, Jeongsoo Ha and Yusung Kim, Sungkyunkwan University, {unigary, hjg1210}@g.skku.edu, yskim525@skku.edu" |
| Pseudocode | Yes | "Algorithm 1 Self-Predictive Dynamics" (a hedged sketch of such an objective follows this table) |
| Open Source Code | Yes | "Our code is available at https://github.com/unigary/SPD." |
| Open Datasets | Yes | "For evaluation, we used a set of continuous control tasks (the DeepMind Control suite [Tassa et al., 2018]) with distracting backgrounds as proposed in [Zhang et al., 2020]. In an autonomous driving task, CARLA [Dosovitskiy et al., 2017], our method achieves the best performance on complex observations containing a lot of task-irrelevant information in realistic driving scenes." |
| Dataset Splits | No | The paper describes training on different backgrounds (Simple Distractor, Natural Video) and testing generalization across them, but it does not specify explicit train/validation/test splits with percentages, sample counts, or citations to predefined partitions. |
| Hardware Specification | No | The paper does not state the hardware used for its experiments, such as GPU/CPU models, processors, or memory. |
| Software Dependencies | No | The paper notes that "Implementation details and hyperparameters are in the supplementary material." but does not list software dependencies with version numbers in the main text. |
| Experiment Setup | No | The paper states that "Implementation details and hyperparameters are in the supplementary material." but provides no concrete setup details, such as hyperparameter values or training configurations, in the main text. |
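To make the Pseudocode row concrete, here is a minimal PyTorch sketch of a self-predictive latent dynamics objective. It illustrates the general technique (BYOL-style latent self-prediction through a forward model), not the paper's actual Algorithm 1: the names `LatentForwardModel` and `self_predictive_loss`, the layer sizes, and the cosine-similarity loss are all assumptions. The authoritative implementation is the authors' repository at https://github.com/unigary/SPD.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentForwardModel(nn.Module):
    """Hypothetical forward model: predicts the next latent from (latent, action)."""
    def __init__(self, latent_dim: int, action_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, z: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, a], dim=-1))

def self_predictive_loss(online_encoder: nn.Module,
                         target_encoder: nn.Module,
                         forward_model: LatentForwardModel,
                         obs: torch.Tensor,
                         action: torch.Tensor,
                         next_obs: torch.Tensor) -> torch.Tensor:
    """Predict the target encoder's embedding of the next observation
    from the online encoder's embedding of the current one."""
    z = online_encoder(obs)                 # online latent for o_t
    z_next_pred = forward_model(z, action)  # predicted latent for o_{t+1}
    with torch.no_grad():                   # stop-gradient target branch
        z_next_tgt = target_encoder(next_obs)
    # Negative cosine similarity: minimized when prediction aligns with target.
    return -F.cosine_similarity(z_next_pred, z_next_tgt, dim=-1).mean()
```

In a training loop of this kind, the target encoder would typically be an exponential moving average of the online encoder, and this self-prediction loss would be added to the RL objective as an auxiliary term; whether SPD follows exactly this recipe should be checked against the paper and its code.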