Structure Learning-Based Task Decomposition for Reinforcement Learning in Non-stationary Environments

Authors: Honguk Woo, Gwangpyo Yoo, Minjong Yoo

AAAI 2022, pp. 8657-8665

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through various experiments, we demonstrate that our approach renders the RL agent adaptable to time-varying dynamic environment conditions, outperforming other methods including state-of-the-art non-stationary MDP algorithms.
Researcher Affiliation | Academia | Honguk Woo*, Gwangpyo Yoo*, Minjong Yoo*; Department of Computer Science and Engineering, Sungkyunkwan University; {hwoo, necrocathy, mjyoo2}@skku.edu
Pseudocode | Yes | Algorithm 1: Task decomposition embedding; Algorithm 2: Sliding windowed OGD (a generic sliding-window OGD sketch follows this table).
Open Source Code | No | The paper states that the model is implemented using Python, Pytorch, and Tensorflow but does not provide any specific link or explicit statement about releasing the source code for the described methodology.
Open Datasets | Yes | Learning Environments: We build a 2-dimensional navigation environment using pyBox2D (Catto 2012)... We also evaluate our approach with the minitaur environment (Tan et al. 2018)... we conduct a case study with autonomous quad-copter drones in the Airsim simulator (Shital Shah and Kapoor 2017). (An illustrative environment-loading sketch follows this table.)
Dataset Splits | No | The paper describes the learning environments and how the agent interacts with them but does not provide specific dataset split information (e.g., percentages or counts for training, validation, or testing sets) for reproducibility.
Hardware Specification | Yes | Our model is implemented using Python v3.7, Pytorch v1.8, and Tensorflow v1.14, and is trained on a system of an Intel(R) Core(TM) i9-10940X processor and an NVIDIA RTX 3090 GPU.
Software Dependencies | Yes | Our model is implemented using Python v3.7, Pytorch v1.8, and Tensorflow v1.14. (A version sanity-check sketch follows this table.)
Experiment Setup | No | Detailed experimental settings, including hyperparameter settings and environment conditions, can be found in the Appendix.
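For readers attempting a reproduction: the paper's Algorithm 2 is only named here ("Sliding windowed OGD"), so the snippet below is a minimal sketch of generic online gradient descent over a sliding window of recent observations, not the authors' exact algorithm. The function name, window size, and learning rate are illustrative assumptions.

```python
import numpy as np
from collections import deque

def sliding_window_ogd(grad_fn, theta0, data_stream, window=50, lr=0.01):
    """Generic sliding-window online gradient descent (illustrative, not the paper's Algorithm 2).

    grad_fn(theta, window_data) must return the gradient of the loss on the
    windowed data at theta. Keeping only the most recent `window` observations
    lets the estimate track a non-stationary stream.
    """
    theta = np.asarray(theta0, dtype=float)
    buffer = deque(maxlen=window)            # sliding window of recent observations
    for x in data_stream:
        buffer.append(x)
        grad = grad_fn(theta, list(buffer))  # gradient computed on the window only
        theta = theta - lr * grad            # standard OGD update
    return theta

# Toy usage: track the mean of a stream whose distribution shifts halfway through.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    stream = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
    mean_grad = lambda th, w: 2.0 * (th - np.mean(w))  # gradient of (th - window mean)^2
    print(sliding_window_ogd(mean_grad, theta0=0.0, data_stream=stream))  # ends near 3.0
```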
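Likewise, none of the three learning environments is packaged with the paper. The sketch below shows one common, assumed way to instantiate the minitaur environment (Tan et al. 2018) through PyBullet's Gym registration; the environment id and the old (pre-0.26) Gym API are assumptions about the reproduction setup, and the 2D pyBox2D navigation task and the AirSim drone study would still need to be rebuilt from the paper's descriptions.

```python
# Illustrative only: the paper does not ship environment code, so this assumes the
# commonly used PyBullet registration of the minitaur task (Tan et al. 2018).
import gym
import pybullet_envs  # noqa: F401 -- importing registers "MinitaurBulletEnv-v0" with Gym

env = gym.make("MinitaurBulletEnv-v0")
obs = env.reset()
for _ in range(100):
    action = env.action_space.sample()          # random policy, just to exercise the loop
    obs, reward, done, info = env.step(action)  # old Gym API returning a 4-tuple
    if done:
        obs = env.reset()
env.close()
```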
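Finally, since only coarse version numbers are reported and no requirements file is released, a reproduction would start by pinning the quoted stack. The check below is a simple illustrative sanity test against those reported versions; patch-level versions are assumptions.

```python
# Illustrative sanity check against the versions quoted in the paper
# (Python v3.7, PyTorch v1.8, TensorFlow v1.14); patch levels are assumptions.
import sys
import torch
import tensorflow as tf

assert sys.version_info[:2] == (3, 7), "paper reports Python v3.7"
assert torch.__version__.startswith("1.8"), "paper reports PyTorch v1.8"
assert tf.__version__.startswith("1.14"), "paper reports TensorFlow v1.14"
print("Local environment matches the versions reported in the paper.")
```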