SeRO: Self-Supervised Reinforcement Learning for Recovery from Out-of-Distribution Situations
Authors: Chan Kim, Jaekyung Cho, Christophe Bobda, Seung-Woo Seo, Seong-Woo Kim
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our in-depth experimental results demonstrate that our method substantially improves the agent's ability to recover from OOD situations in terms of sample efficiency and restoration of the performance for the original tasks. We conducted experiments on four OpenAI Gym MuJoCo environments to answer the above questions. |
| Researcher Affiliation | Academia | Chan Kim1, Jaekyung Cho1, Christophe Bobda2, Seung-Woo Seo1 and Seong-Woo Kim1; 1Seoul National University, 2University of Florida; {chan kim, jackyoung96, sseo, snwoo}@snu.ac.kr, cbobda@ece.ufl.edu |
| Pseudocode | No | A detailed explanation of the overall retraining procedure can be found in the supplementary material. |
| Open Source Code | Yes | Code and supplementary materials are available at https://github.com/SNUChanKim/SeRO. |
| Open Datasets | Yes | We used HalfCheetah-v2, Hopper-v2, Walker2d-v2, and Ant-v2 from Gym's MuJoCo environments [Brockman et al., 2016]. |
| Dataset Splits | No | The paper describes training and retraining phases in simulation environments but does not provide specific dataset split information (e.g., percentages or sample counts) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'OpenAI Gym's MuJoCo environments', 'SAC', and 'Python' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We first trained the agents in the training environments for 1 million steps using SAC. |