Rewiring Neurons in Non-Stationary Environments

Authors: Zhicheng Sun, Yadong Mu

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our proposed method is comprehensively evaluated on 18 continual reinforcement learning scenarios ranging from locomotion to manipulation, demonstrating its advantages over state-of-the-art competitors in performance-efficiency tradeoffs. Code is available at https://github.com/feifeiobama/RewireNeuron.
Researcher Affiliation | Academia | Zhicheng Sun, Yadong Mu; Peking University, Beijing, China; {sunzc,myd}@pku.edu.cn
Pseudocode | No | The paper describes its methods verbally and with figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/feifeiobama/RewireNeuron.
Open Datasets | Yes | Environments. We use 18 continual reinforcement learning scenarios from Brax and Continual World: (1) Brax [18, 20] contains 9 locomotion scenarios over 3 domains: HalfCheetah, Ant and Humanoid. (2) Continual World [69] is a manipulation benchmark built on Meta-World [73] and MuJoCo [65], featuring 8 scenarios with 3 tasks (CW3) and one scenario with 10 tasks (CW10), both with a varying reward function and a budget of 1M interactions per task. More details are provided in Appendix A.1. (A task-sequence sketch is given after the table.)
Dataset Splits | No | The paper does not provide explicit numerical or proportional splits (e.g., train/validation/test percentages or counts) for datasets used in the experiments. It mentions only per-task interaction budgets (1M interactions per task), which are not dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for experiments.
Software Dependencies | No | We build on the SaLinA library [12] and adopt Soft Actor-Critic (SAC) [25] with autotuned temperature [26] as the underlying algorithm. Both the actor and the critic are 4-layer perceptrons with 256 hidden neurons per layer, while the actor also includes task-specific heads [69]. Their training configurations follow [20].
Experiment Setup | Yes | Implementation details. We build on the SaLinA library [12] and adopt Soft Actor-Critic (SAC) [25] with autotuned temperature [26] as the underlying algorithm. Both the actor and the critic are 4-layer perceptrons with 256 hidden neurons per layer, while the actor also includes task-specific heads [69]. Their training configurations follow [20]. For our method, we choose the new hyperparameters K, α, and β via grid search for each scenario, and provide a sensitivity analysis in Section 4.3. The score vectors in Eq. (4) are initialized with an arithmetic sequence rescaled to [0, 1], and the temperature is τ = 1 by default. (An actor sketch is given after the table.)
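
For concreteness, here is a minimal Python sketch of the sequential training loop implied by the Open Datasets row: a sequence of tasks, each trained under a fixed interaction budget (the 1M-per-task figure quoted above applies to Continual World). The brax.envs.create constructor is Brax's public API; the task list, the train_sac stub, and the agent placeholder are illustrative assumptions, not the paper's actual scenario definitions (those are given in its Appendix A.1).

    # Continual RL over a task sequence with a fixed per-task budget.
    # The task list and train_sac are hypothetical placeholders.
    from brax import envs

    TASK_SEQUENCE = ["halfcheetah", "halfcheetah", "halfcheetah"]  # real scenarios vary the task per step
    BUDGET_PER_TASK = 1_000_000  # 1M environment interactions per task

    def train_sac(env, agent, num_interactions):
        """Stub: would run SAC on env for num_interactions steps."""
        pass

    agent = None  # the (rewired) SAC agent; construction omitted
    for task_id, env_name in enumerate(TASK_SEQUENCE):
        env = envs.create(env_name=env_name)  # standard Brax constructor
        train_sac(env, agent, BUDGET_PER_TASK)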
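
And a minimal PyTorch sketch of the actor from the Experiment Setup row. The 4-layer architecture with 256 hidden neurons, the per-task heads, the arithmetic-sequence score initialization rescaled to [0, 1], and τ = 1 are taken from the quotes above; the split into a shared trunk plus a task-specific head, the tensor dimensions, and how the scores enter the forward pass (Eq. (4) is not reproduced here) are assumptions, not the authors' implementation (see the linked repository for that).

    # Sketch only: the rewiring mechanism of Eq. (4) is deliberately omitted.
    import torch
    import torch.nn as nn

    class MultiHeadActor(nn.Module):
        def __init__(self, obs_dim, act_dim, num_tasks, hidden=256):
            super().__init__()
            # Three shared layers; the task-specific head below is the 4th
            # layer (one plausible reading of "4-layer perceptron").
            self.trunk = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            # One task-specific output head per task [69].
            self.heads = nn.ModuleList(
                [nn.Linear(hidden, act_dim) for _ in range(num_tasks)]
            )
            # Score vector: an arithmetic sequence rescaled to [0, 1].
            # Its use for rewiring neurons (Eq. (4)) is not shown here.
            self.scores = nn.Parameter(torch.linspace(0.0, 1.0, hidden))
            self.tau = 1.0  # temperature, tau = 1 by default

        def forward(self, obs, task_id):
            # In the actual method, a rewiring of trunk neurons driven by
            # self.scores would be applied here.
            return self.heads[task_id](self.trunk(obs))

    # Usage with HalfCheetah-like dimensions (assumed, for illustration):
    actor = MultiHeadActor(obs_dim=17, act_dim=6, num_tasks=3)
    out = actor(torch.randn(1, 17), task_id=0)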