Rewiring Neurons in Non-Stationary Environments
Authors: Zhicheng Sun, Yadong Mu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our proposed method is comprehensively evaluated on 18 continual reinforcement learning scenarios ranging from locomotion to manipulation, demonstrating its advantages over state-of-the-art competitors in performance-efficiency tradeoffs. Code is available at https://github.com/feifeiobama/RewireNeuron. |
| Researcher Affiliation | Academia | Zhicheng Sun, Yadong Mu. Peking University, Beijing, China. {sunzc,myd}@pku.edu.cn |
| Pseudocode | No | The paper describes its methods verbally and with figures, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/feifeiobama/RewireNeuron. |
| Open Datasets | Yes | Environments. We use 18 continual reinforcement learning scenarios from Brax and Continual World: (1) Brax [18, 20] contains 9 locomotion scenarios over 3 domains: HalfCheetah, Ant and Humanoid. (2) Continual World [69] is a manipulation benchmark built on Meta-World [73] and MuJoCo [65], featuring 8 scenarios with 3 tasks (CW3) and one scenario with 10 tasks (CW10), both with a varying reward function and a budget of 1M interactions per task. More details are provided in Appendix A.1. (See the environment sketch below the table.) |
| Dataset Splits | No | The paper does not provide explicit numerical or proportional splits (e.g., train/validation/test percentages or counts) for datasets used in the experiments. It mentions only per-task interaction budgets, such as 1M interactions per task in the Continual World scenarios. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for experiments. |
| Software Dependencies | No | The paper names the libraries it builds on but does not give a versioned dependency list: We build on the SaLinA library [12] and adopt Soft Actor-Critic (SAC) [25] with autotuned temperature [26] as the underlying algorithm. Both the actor and the critic are 4-layer perceptrons with 256 hidden neurons per layer, while the actor also includes task-specific heads [69]. Their training configurations follow [20]. |
| Experiment Setup | Yes | Implementation details. We build on the SaLinA library [12] and adopt Soft Actor-Critic (SAC) [25] with autotuned temperature [26] as the underlying algorithm. Both the actor and the critic are 4-layer perceptrons with 256 hidden neurons per layer, while the actor also includes task-specific heads [69]. Their training configurations follow [20]. For our method, we choose the new hyperparameters K, α, and β via grid search for each scenario, and provide a sensitivity analysis in Section 4.3. The score vectors in Eq. (4) are initialized with an arithmetic sequence rescaled to [0, 1], and the temperature is τ = 1 by default. (See the architecture sketch below the table.) |
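
For context on the Open Datasets row, here is a minimal sketch of instantiating one of the quoted Brax locomotion domains. It assumes the open-source `brax` and `jax` packages; it is illustrative only and is not taken from the authors' released code, which chains such tasks into continual learning scenarios.

```python
import jax
from brax import envs

# One of the three Brax locomotion domains listed above
# (HalfCheetah, Ant, Humanoid).
env = envs.create(env_name="halfcheetah", episode_length=1000)

rng = jax.random.PRNGKey(0)
state = env.reset(rng)  # initial simulator state for this task

obs_dim, act_dim = env.observation_size, env.action_size
```
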
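For the Experiment Setup row, the following is a hedged PyTorch sketch of what a 4-layer actor with 256 hidden neurons per layer and task-specific heads could look like, together with the arithmetic-sequence score initialization rescaled to [0, 1]. Class and variable names (`MultiHeadActor`, `n_neurons`) are assumptions for illustration; the authors' actual rewiring layers live in the linked repository.

```python
import torch
import torch.nn as nn


class MultiHeadActor(nn.Module):
    """Illustrative 4-layer MLP (256 hidden units) with one head per task.

    This loosely mirrors the setup quoted above; it is NOT the authors'
    implementation (see https://github.com/feifeiobama/RewireNeuron).
    """

    def __init__(self, obs_dim: int, act_dim: int, num_tasks: int, hidden: int = 256):
        super().__init__()
        # Three shared hidden layers plus a task-specific output layer = 4 layers.
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One output head per task, as in the quoted task-specific heads.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, act_dim) for _ in range(num_tasks)]
        )

    def forward(self, obs: torch.Tensor, task_id: int) -> torch.Tensor:
        return self.heads[task_id](self.backbone(obs))


# Score vectors initialized as an arithmetic sequence rescaled to [0, 1],
# as described in the setup row; using 256 entries (one per hidden neuron)
# is an assumption here.
n_neurons = 256
scores = torch.linspace(0.0, 1.0, steps=n_neurons)
tau = 1.0  # default temperature quoted in the table
```

In the paper's SAC setup the head would parameterize a Gaussian policy rather than a single output layer; the sketch keeps one linear head per task for brevity.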