Deep Reinforcement Learning amidst Continual Structured Non-Stationarity
Authors: Annie Xie, James Harrison, Chelsea Finn
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experimental evaluation, we find that our method far outperforms RL algorithms that do not account for environment non-stationarity, handles extrapolating environment shifts, and retains strong performance in stationary settings. |
| Researcher Affiliation | Academia | Annie Xie, James Harrison, Chelsea Finn (Stanford University). Correspondence to: Annie Xie <anniexie@stanford.edu>. |
| Pseudocode | Yes | Algorithm 1 Lifelong Latent Actor-Critic (LILAC) |
| Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper. |
| Open Datasets | Yes | The first is derived from the simulated Sawyer reaching task in the Meta-World benchmark (Yu et al., 2019), in which the target position is not observed and moves between episodes. In the second environment based on Half-Cheetah from OpenAI Gym (Brockman et al., 2016)... We next consider the 8-DoF minitaur environment (Tan et al., 2018)... A hedged environment-setup sketch follows the table. |
| Dataset Splits | No | No specific details on training, validation, and test dataset splits are provided (e.g., percentages or sample counts). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for the experimental setup. |
| Software Dependencies | No | The paper mentions software components like OpenAI Gym and Meta-World but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We tune the hyperparameters for all approaches, and run each with the best hyperparameter setting with 3 random seeds. For all hyperparameter details, see Appendix B. A seeding sketch follows the table. |
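
The benchmark environments quoted in the Open Datasets row are all publicly available. The sketch below shows one plausible way to instantiate them with current packages; the `gym`, `metaworld`, and `pybullet_envs` package names and environment IDs are assumptions based on today's releases, not the authors' pinned dependencies (none are stated in the paper).

```python
# Hedged sketch: instantiating the three benchmark families named in the
# Open Datasets row. Environment IDs reflect current public releases and
# are assumptions; the authors' exact versions (Meta-World 2019, Gym-era
# Half-Cheetah, the Tan et al. minitaur) may differ.
import gym
import metaworld
import pybullet_envs  # noqa: F401  -- import side effect registers Bullet envs

# Half-Cheetah from OpenAI Gym (requires a MuJoCo installation).
cheetah = gym.make("HalfCheetah-v2")

# Sawyer reaching from Meta-World; the task (target position) is set
# separately, mirroring the paper's hidden, shifting target.
ml1 = metaworld.ML1("reach-v2")  # the 2019 release used "reach-v1" naming
sawyer = ml1.train_classes["reach-v2"]()
sawyer.set_task(ml1.train_tasks[0])

# Minitaur locomotion via PyBullet's Gym registration.
minitaur = gym.make("MinitaurBulletEnv-v0")
```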
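The Experiment Setup row states that each method is run with its best hyperparameter setting under 3 random seeds. Since the code is not released, the following is only a minimal sketch of that protocol using standard seeding practice; `run_training` is a hypothetical stand-in for one training run, not the authors' entry point.

```python
# Minimal seeding sketch for the "3 random seeds" protocol. This is
# standard practice, not the authors' code; run_training is hypothetical.
import random

import numpy as np
import torch


def run_training(seed: int) -> float:
    """Hypothetical stand-in for one full training run; returns a score."""
    return float(np.random.rand())  # placeholder result


def run_with_seed(seed: int) -> float:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # A Gym environment would be seeded separately, e.g.
    #   env.seed(seed); env.action_space.seed(seed)
    return run_training(seed=seed)


# Three seeds per method, as described in the Experiment Setup row.
results = [run_with_seed(s) for s in (0, 1, 2)]
```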