Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Diffusion-based Curriculum Reinforcement Learning

Authors: Erdi Sayar, Giovanni Iacca, Ozgur S. Oguz, Alois Knoll

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of DiCuRL in three different maze environments and two robotic manipulation tasks simulated in MuJoCo, where it outperforms or matches nine state-of-the-art CRL algorithms from the literature.
Researcher Affiliation | Academia | Technical University of Munich¹, University of Trento², Bilkent University³
Pseudocode | Yes | Algorithm 1: Diffusion Curriculum Goal Generator; Algorithm 2: RL Training and Evaluation
Open Source Code | Yes | Our codebase is available at: https://github.com/erdiphd/DiCuRL/
Open Datasets | Yes | To evaluate our proposed method, we conducted experiments across three maze environments simulated in MuJoCo³: PointUMaze, PointNMaze, and PointSpiralMaze. ... ³Details available at https://robotics.farama.org/envs/maze/point_maze/.
Dataset Splits | No | The paper specifies training and testing details but does not explicitly mention or describe distinct validation dataset splits in the manner of supervised learning. It discusses 'test rollouts' for evaluation, but not a separate validation set split.
Hardware Specification | Yes | We conducted our experiments on a cluster computer using an NVIDIA RTX A5000 GPU, 64GB of RAM, and a 4-core CPU.
Software Dependencies | No | The paper mentions converting code from 'TensorFlow to PyTorch' and that their diffusion model is 'based on PyTorch' but does not specify exact version numbers for these or any other key software components.
Experiment Setup | Yes | All our algorithms hyperparameters used in the experiments are reported in Table 3.