Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Unlock the Intermittent Control Ability of Model Free Reinforcement Learning

Authors: Jiashun Liu, Jianye Hao, Xiaotian Hao, Yi Ma, YAN ZHENG, Yujing Hu, Tangjie Lv

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on simulation tasks and real-world robotic grasping tasks show that MARS significantly improves the learning efficiency and final performances compared with existing baselines.
Researcher Affiliation	Collaboration	Jiashun Liu, Jianye Hao & Xiaotian Hao: College of Intelligence and Computing, Tianjin University, China; Yi Ma: School of Computer and Information Technology, Shanxi University, China; YAN ZHENG: Tianjin University, China; EMAIL; Yujing Hu & Tangjie Lv: FUXI AI Laboratory, NetEase, China
Pseudocode	Yes	Algorithm 1 MARS-TD3
Open Source Code	No	Our codes are implemented with Python 3.7.9 and Torch 1.7.1. All experiments were run on a single NVIDIA GeForce GTX 3090 GPU. Each single training trial ranges from 4 hours to 17 hours, depending on the algorithms and environments. We will open source code in the near future.
Open Datasets	Yes	For robot control tasks, we select four typical openai Mujoco tasks with random interaction delays, i.e., Hopper, Ant, Walker, HalfCheetah. Mujoco is a well-known testbed and is widely used in reinforcement learning research [Brockman et al., 2016]. For navigation tasks, we used the medium and difficult maps of 2dmaze in D4RL [Fu et al., 2020].
Dataset Splits	No	The paper describes using standard environments (Mujoco, D4RL) and running multiple rounds of experiments (e.g., 15 rounds for grasping), but it does not specify explicit data splits (e.g., percentages or counts) for training, validation, or testing datasets for reproduction.
Hardware Specification	Yes	All experiments were run on a single NVIDIA GeForce GTX 3090 GPU.
Software Dependencies	Yes	Our codes are implemented with Python 3.7.9 and Torch 1.7.1.
Experiment Setup	Yes	For all tasks, we set the dimension of z_t to 8 and the scaling parameter β to 5. We set the warm-up (stage 1) step to 400000 and 100000 for the Mujoco tasks and the navigation task respectively. Detailed parameter setting can be found in appendix B.2. ... Tab. 4 shows the common hyperparameters of algorithms used in all our experiments. ... Batch Size 128, Buffer Size 1e5.
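
For readers attempting a reproduction, the setup values quoted in the table above can be collected into one configuration object. The following is a minimal sketch, not the authors' code (which has not been released): the class name MarsSetup, its field names, and the task-family keys "mujoco" and "navigation" are hypothetical, and only the numeric values come from the quoted text. Hyperparameters that the paper defers to appendix B.2 and Tab. 4 are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarsSetup:
    """Hypothetical container for the setup values quoted in the table."""

    latent_dim: int = 8                     # dimension of z_t
    beta: float = 5.0                       # scaling parameter β
    batch_size: int = 128                   # "Batch Size 128"
    buffer_size: int = 100_000              # "Buffer Size 1e5"
    warmup_steps_mujoco: int = 400_000      # stage-1 warm-up, Mujoco tasks
    warmup_steps_navigation: int = 100_000  # stage-1 warm-up, navigation task

    def warmup_steps(self, task_family: str) -> int:
        """Return the stage-1 warm-up length for a task family.

        The keys "mujoco" and "navigation" are illustrative labels,
        not identifiers taken from the paper.
        """
        table = {
            "mujoco": self.warmup_steps_mujoco,
            "navigation": self.warmup_steps_navigation,
        }
        if task_family not in table:
            raise ValueError(f"unknown task family: {task_family!r}")
        return table[task_family]


if __name__ == "__main__":
    setup = MarsSetup()
    for family in ("mujoco", "navigation"):
        print(family, setup.warmup_steps(family))
```

Keeping the two warm-up lengths as separate fields mirrors the table, which reports different values for the Mujoco tasks and the navigation task.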