Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Real-DRL: Teach and Learn at Runtime

Authors: Yanbing Mao, Yihao Cai, Lui Sha

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments with a real quadruped robot, a quadruped robot in NVIDIA Isaac Gym, and a cart-pole system, along with comparisons and ablation studies, demonstrate the Real-DRL's effectiveness and unique features.
Researcher Affiliation | Academia | Yanbing Mao, Engineering Technology Division, Wayne State University, Detroit, MI 48202, EMAIL; Yihao Cai, Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI 48202, EMAIL; Lui Sha, Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, EMAIL
Pseudocode | Yes | Real-DRL framework, which is also formally described in Algorithm 1 of Appendix B. ... This teaching-to-learn mechanism is also formally described by Lines 12 to 16 in Algorithm 1 of Appendix B. ... The pseudocode for computing Fk using CVX and LMI, with a comment, is presented in Appendix E.
Open Source Code | Yes | The complete code and details can be found on GitHub: https://github.com/Charlescai123/Real-DRL. ... The code is available at GitHub: https://github.com/Charlescai123/ecvxcone.
Open Datasets | No | The paper describes experiments on "a real quadruped robot, a quadruped robot in NVIDIA Isaac Gym, and a cart-pole system". These are physical systems or simulation environments where data is generated through interaction, not pre-collected open datasets in the typical sense.
Dataset Splits | No | The paper does not use pre-collected datasets; instead, data is generated dynamically through interactions with real robots and simulation environments (NVIDIA Isaac Gym, OpenAI Gym) during runtime learning. Therefore, explicit training/test/validation dataset splits are not applicable or provided.
Hardware Specification | Yes | For computation resources, we utilized a desktop running Ubuntu 22.04, equipped with a 12th Gen Intel(R) Core(TM) i9-12900K 16-core processor, 64 GB of RAM, and an NVIDIA GeForce GTX 3090 GPU.
Software Dependencies | No | The algorithm was implemented in Python, utilizing the TensorFlow framework alongside the Python CVXPY toolbox for solving real-time patches. ... The algorithm was implemented in Python using the PyTorch framework, and the MATLAB LMI toolbox was employed for solving real-time patches. (No specific version numbers are provided for Python, TensorFlow, PyTorch, CVXPY, or MATLAB LMI toolbox).
Experiment Setup | Yes | The actor and critic networks are implemented as MLPs with four fully connected layers. The output dimensions for the critic network are 256, 128, 64, and 1, while those for the actor network are 256, 128, 64, and 6. ... The discount factor γ is set to 0.9, and the learning rates for both the critic and actor networks are set at 0.0003. Finally, the batch size is configured to 512.
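The hyperparameters quoted in the Experiment Setup entry can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the names TrainingConfig and mlp_parameter_count are hypothetical, and the quoted excerpt does not state the network input dimensions, so any input size passed to the helper is an assumption made by the caller.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values quoted in the Experiment Setup entry."""

    # Output dimension of each of the four fully connected layers.
    critic_layer_outputs: Tuple[int, ...] = (256, 128, 64, 1)
    actor_layer_outputs: Tuple[int, ...] = (256, 128, 64, 6)
    discount_factor: float = 0.9   # gamma
    learning_rate: float = 0.0003  # shared by the actor and the critic
    batch_size: int = 512


def mlp_parameter_count(input_dim: int, layer_outputs: Sequence[int]) -> int:
    """Count the weights and biases of a fully connected MLP.

    input_dim is NOT given in the quoted excerpt; the caller must supply
    an assumed value.
    """
    total = 0
    fan_in = input_dim
    for fan_out in layer_outputs:
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
        fan_in = fan_out
    return total


if __name__ == "__main__":
    config = TrainingConfig()
    assumed_state_dim = 12  # assumption for illustration only
    n_params = mlp_parameter_count(assumed_state_dim, config.actor_layer_outputs)
    print(f"Actor parameters for an assumed {assumed_state_dim}-d input: {n_params}")
```

Keeping the quoted values in one frozen dataclass makes it easy to compare them against whichever configuration a reproduction attempt actually runs with.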