Factored Adaptation for Non-Stationary Reinforcement Learning

Authors: Fan Feng, Biwei Huang, Kun Zhang, Sara Magliacane

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.
Researcher Affiliation | Collaboration | Fan Feng (City University of Hong Kong), Biwei Huang (Carnegie Mellon University), Kun Zhang (Carnegie Mellon University; Mohamed bin Zayed University of Artificial Intelligence), Sara Magliacane (University of Amsterdam; MIT-IBM Watson AI Lab)
Pseudocode | Yes | We propose FN-VAE, a variational autoencoder architecture described in Fig. 1(b). In FN-VAE, we jointly learn the structural relationships, the state transition function, the reward function, and the transition function of the latent change factors, as described in detail in Appendix Alg. A1.
Open Source Code | Yes | The implementation will be open-sourced at https://bit.ly/3erKoWm.
Open Datasets | Yes | We evaluate our approach on four well-established benchmarks, including Half-Cheetah-V3 from MuJoCo [20, 21], Sawyer-Reaching and Sawyer-Peg from Sawyer [22, 18], and Minitaur [23].
Dataset Splits | No | The paper describes online model estimation and policy optimization through interaction with the environments rather than fixed training/validation/test splits. It refers to training iterations and collected trajectories, but not to explicit dataset splits.
Hardware Specification | Yes | Information on computational resources is given in Appendix E.1: the experiments are run on Linux machines with NVIDIA A6000/V100/A100 GPUs.
Software Dependencies | No | The paper mentions software such as MuJoCo and OpenAI Gym, but does not specify version numbers or other software dependencies required for reproduction.
Experiment Setup | Yes | Details on the meta-learning setups are given in Appendix D.4; detailed function equations and experimental setups are given in Appendix C.