Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
Authors: Wanpeng Zhang, Yilin Li, Boyu Yang, Zongqing Lu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results further demonstrate the superior performance of COREP over existing methods in tackling non-stationarity problems." and Section 4 (Experiments) |
| Researcher Affiliation | Academia | 1) School of Computer Science, Peking University; 2) Center for Statistical Science, Peking University; 3) School of Data Science, Fudan University; 4) Beijing Academy of Artificial Intelligence |
| Pseudocode | Yes | The detailed steps of COREP are outlined in Algorithm C.1. |
| Open Source Code | Yes | The code is available at https://github.com/PKURL/COREP. |
| Open Datasets | Yes | The experiments are conducted on various environments from the DeepMind Control Suite (Tassa et al., 2018), which is a widely used benchmark for RL algorithms. |
| Dataset Splits | No | The paper mentions collecting trajectories and updating replay buffers but does not explicitly state specific train, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | CPU: Intel i9-12900K @ 3.2 GHz (24 cores); GPU: Nvidia RTX 3090 (24 GB) × 2; RAM: 256 GB |
| Software Dependencies | No | The paper lists software libraries used such as 'NumPy', 'PyTorch', 'PyTorch Geometric', 'DeepMind Control', and 'OpenAI Gym' but does not specify their version numbers. |
| Experiment Setup | Yes | We list the hyperparameters for MLP, GAT, and VAE structures in Table C.1, and the hyperparameters for policy optimization and training in Table C.2. |
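
The table notes that experiments use the DeepMind Control Suite. As a minimal sketch of what reproducing that environment setup involves (not the paper's actual configuration: the domain/task names below are illustrative placeholders, and COREP's real environment handling lives in the linked repository), the snippet loads a suite task and steps it with random actions sampled from the action spec:

```python
# Minimal sketch, assuming the standard dm_control API; domain/task
# names are placeholders, not COREP's exact experimental settings.
import numpy as np
from dm_control import suite

# Load one benchmark task from the DeepMind Control Suite.
env = suite.load(domain_name="cheetah", task_name="run")

spec = env.action_spec()
time_step = env.reset()
while not time_step.last():
    # Sample a random action within the spec's bounds (stand-in for a policy).
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)
```

Pinning the versions of dm_control and its dependencies when running such a script would address the missing software-version information flagged above.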