Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning
Authors: Wenzhen Huang, Qiyue Yin, Junge Zhang, Kaiqi Huang
Pages: 7848-7856
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate that our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks. |
| Researcher Affiliation | Academia | 1 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China 2 CRISE, Institute of Automation, Chinese Academy of Sciences, Beijing, China 3 CAS Center for Excellence in Brain Science and Intelligence Technology, Beijing, China |
| Pseudocode | Yes | Algorithm 1 Reweighted Probabilistic-Ensemble Soft Actor-Critic (ReW-PE-SAC) |
| Open Source Code | No | The paper mentions a PyTorch implementation for SAC baseline, but there is no explicit statement or link indicating that the authors' own code for the proposed method is open-source or available. |
| Open Datasets | Yes | We evaluate our algorithm on six complex continuous control tasks from the model-based RL benchmark (Wang et al. 2019), which is modified from the OpenAI gym benchmark suite (Brockman et al. 2016). |
| Dataset Splits | No | The paper describes using a replay buffer for training but does not specify explicit training, validation, or test dataset splits with percentages or sample counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions a PyTorch implementation for a baseline SAC, but it does not provide specific version numbers for general software dependencies (e.g., Python, PyTorch) required to replicate the experiment. |
| Experiment Setup | Yes | The network architecture and training hyperparameters are given in the appendix. |