Structure Learning-Based Task Decomposition for Reinforcement Learning in Non-stationary Environments

Authors: Honguk Woo, Gwangpyo Yoo, Minjong Yoo

AAAI 2022, pp. 8657-8665

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through various experiments, we demonstrate that our approach renders the RL agent adaptable to time-varying dynamic environment conditions, outperforming other methods including state-of-the-art non-stationary MDP algorithms.
Researcher Affiliation | Academia | Honguk Woo*, Gwangpyo Yoo*, Minjong Yoo*; Department of Computer Science and Engineering, Sungkyunkwan University; {hwoo, necrocathy, mjyoo2}@skku.edu
Pseudocode | Yes | Algorithm 1: Task decomposition embedding; Algorithm 2: Sliding windowed OGD (a generic sliding-window OGD sketch follows this table).
Open Source Code | No | The paper states that the model is implemented using Python, Pytorch, and Tensorflow but does not provide any specific link or explicit statement about releasing the source code for the described methodology.
Open Datasets | Yes | Learning Environments: We build a 2-dimensional navigation environment using pyBox2D (Catto 2012)... We also evaluate our approach with the minitaur environment (Tan et al. 2018)... we conduct a case study with autonomous quad-copter drones in the Airsim simulator (Shital Shah and Kapoor 2017). (An illustrative environment-loading sketch follows this table.)
Dataset Splits | No | The paper describes the learning environments and how the agent interacts with them but does not provide specific dataset split information (e.g., percentages or counts for training, validation, or testing sets) for reproducibility.
Hardware Specification | Yes | Our model is implemented using Python v3.7, Pytorch v1.8, and Tensorflow v1.14, and is trained on a system of an Intel(R) Core(TM) i9-10940X processor and an NVIDIA RTX 3090 GPU.
Software Dependencies | Yes | Our model is implemented using Python v3.7, Pytorch v1.8, and Tensorflow v1.14. (A version sanity-check sketch follows this table.)
Experiment Setup | No | Detailed experimental settings, including hyperparameter settings and environment conditions, can be found in the Appendix.
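For readers attempting a reproduction: the paper's Algorithm 2 is only named here ("Sliding windowed OGD"), so the snippet below is a minimal sketch of generic online gradient descent over a sliding window of recent observations, not the authors' exact algorithm. The function name, window size, and learning rate are illustrative assumptions.

```python
import numpy as np
from collections import deque

def sliding_window_ogd(grad_fn, theta0, data_stream, window=50, lr=0.01):
    """Generic sliding-window online gradient descent (illustrative, not the paper's Algorithm 2).

    grad_fn(theta, window_data) must return the gradient of the loss on the
    windowed data at theta. Keeping only the most recent `window` observations
    lets the estimate track a non-stationary stream.
    """
    theta = np.asarray(theta0, dtype=float)
    buffer = deque(maxlen=window)            # sliding window of recent observations
    for x in data_stream:
        buffer.append(x)
        grad = grad_fn(theta, list(buffer))  # gradient computed on the window only
        theta = theta - lr * grad            # standard OGD update
    return theta

# Toy usage: track the mean of a stream whose distribution shifts halfway through.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stream = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
    mean_grad = lambda th, w: 2.0 * (th - np.mean(w))  # gradient of (th - window mean)^2
    print(sliding_window_ogd(mean_grad, theta0=0.0, data_stream=stream))  # ends near 3.0
```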
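Likewise, none of the three learning environments is packaged with the paper. The sketch below shows one common, assumed way to instantiate the minitaur environment (Tan et al. 2018) through PyBullet's Gym registration; the environment id and the old (pre-0.26) Gym API are assumptions about the reproduction setup, and the 2D pyBox2D navigation task and the AirSim drone study would still need to be rebuilt from the paper's descriptions.

```python
# Illustrative only: the paper does not ship environment code, so this assumes the
# commonly used PyBullet registration of the minitaur task (Tan et al. 2018).
import gym
import pybullet_envs  # noqa: F401 -- importing registers "MinitaurBulletEnv-v0" with Gym

env = gym.make("MinitaurBulletEnv-v0")
obs = env.reset()
for _ in range(100):
    action = env.action_space.sample()          # random policy, just to exercise the loop
    obs, reward, done, info = env.step(action)  # old Gym API returning a 4-tuple
    if done:
        obs = env.reset()
env.close()
```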
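Finally, since only coarse version numbers are reported and no requirements file is released, a reproduction would start by pinning the quoted stack. The check below is a simple illustrative sanity test against those reported versions; patch-level versions are assumptions.

```python
# Illustrative sanity check against the versions quoted in the paper
# (Python v3.7, PyTorch v1.8, TensorFlow v1.14); patch levels are assumptions.
import sys
import torch
import tensorflow as tf

assert sys.version_info[:2] == (3, 7), "paper reports Python v3.7"
assert torch.__version__.startswith("1.8"), "paper reports PyTorch v1.8"
assert tf.__version__.startswith("1.14"), "paper reports TensorFlow v1.14"
print("Local environment matches the versions reported in the paper.")
```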