Diffusion Imitation from Observation

Authors: Bo-Ruei Huang, Chun-Kai Yang, Chun-Mao Lai, Dai-Jie Wu, Shao-Hua Sun

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We compare our method DIFO to various existing LfO methods in various continuous control domains, including navigation, locomotion, manipulation, and games. The experimental results show that DIFO consistently exhibits superior performance.
Researcher Affiliation Academia Bo-Ruei Huang, Chun-Kai Yang, Chun-Mao Lai, Dai-Jie Wu, Shao-Hua Sun — Department of Electrical Engineering, National Taiwan University
Pseudocode Yes Appendix A: Pseudocode of DIFO. Algorithm 1: Diffusion Imitation from Observation (DIFO).
Open Source Code No Answer: [No] Justification: We plan to release code, expert datasets, and models recently.
Open Datasets Yes We collect 60 demonstrations (36 000 transitions) using a controller from Fu et al. [14]. We use 100 demonstrations (7000 transitions) from Minari [50]. Table 2: Expert observations. Detailed information on collected expert observations in each task: POINTMAZE — 60 demonstrations, 36 000 transitions, D4RL [14]; ANTMAZE — 100 demonstrations, 7000 transitions, Minari [50].
Dataset Splits No The paper does not explicitly provide training/test/validation dataset splits with specific percentages or counts. It mentions collecting expert demonstrations and performing online interactions, but not how these are formally split for training, validation, or testing.
Hardware Specification Yes Table 6: Computational resources. Workstation 1: Intel Xeon w7-2475X, 2× NVIDIA GeForce RTX 4090, 125 GiB. Workstation 2: Intel Xeon w5-2455X, 2× NVIDIA RTX A6000, 125 GiB. Workstation 3: Intel Xeon W-2255, 2× NVIDIA GeForce RTX 4070 Ti, 125 GiB. Workstation 4: Intel Xeon W-2255, 2× NVIDIA GeForce RTX 4070 Ti, 125 GiB.
Software Dependencies No The paper mentions key software components such as Imitation, Stable Baselines3, and the diffusers package, as well as the Adam optimizer, but does not provide the specific version numbers required for a reproducible setup.
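One common way to close this kind of gap is to record the exact installed version of each named dependency. The sketch below does this with the Python standard library; the PyPI package names (`stable-baselines3`, `imitation`, `diffusers`) are assumptions, since the paper does not state them.

```python
# Minimal sketch: pin the versions of the dependencies the paper names,
# so the software environment can be reproduced exactly.
# Package names below are assumed PyPI identifiers, not taken from the paper.
from importlib.metadata import version, PackageNotFoundError

PACKAGES = ["stable-baselines3", "imitation", "diffusers"]

def pinned_requirements(packages):
    """Return 'name==version' lines for installed packages, noting missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(f"# {name}: not installed")
    return lines

print("\n".join(pinned_requirements(PACKAGES)))
```

Writing these lines to a `requirements.txt` alongside the released code would let others reinstall the exact versions with `pip install -r requirements.txt`.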
Experiment Setup Yes Table 4: Hyperparameters. The overview of the hyperparameters used for all the methods in every task. We abbreviate 'Discriminator' as 'Disc.' in this table. Table 5: SAC & PPO training parameters.