Imitating Human Behaviour with Diffusion Models

Authors: Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

ICLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Experimentally, diffusion models closely match human demonstrations in a simulated robotic control task and a modern 3D gaming environment.
Researcher Affiliation Industry All authors (Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin) are affiliated with Microsoft Research.
Pseudocode Yes Appendix D: SAMPLING ALGORITHMS includes 'Algorithm 1 Sampling for Diffusion BC', 'Algorithm 2 Sampling for Diffusion-X', and 'Algorithm 3 Sampling for Diffusion-KDE'.
Open Source Code Yes Code: https://github.com/microsoft/Imitating-Human-Behaviour-w-Diffusion.
Open Datasets No No, the paper describes the datasets used ('The demonstration dataset contains 566 trajectories' for Kitchen; 'The demonstration dataset contains 45,000 observation/action tuples' for CSGO) and cites their origin (Gupta et al., 2020 for Kitchen; Pearce and Zhu, 2022 for the CSGO environment), but provides no URLs, DOIs, or repository names, and makes no explicit statement that these specific datasets are publicly available for download.
Dataset Splits No No, the paper refers to a 'demonstration dataset' and evaluates models by rollouts (e.g., 'roll out 100 trajectories of length 280 for evaluation'), but does not specify training/validation/test splits by percentage, count, or reference to predefined splits.
Hardware Specification Yes we were able to roll out our diffusion models at 8Hz on an average gaming GPU (NVIDIA GTX 1060 Mobile). ... Kitchen environment, V100 GPU ... CSGO environment, ResNet18 observation encoder, V100 GPU
Software Dependencies No No, the paper mentions software components such as 'scikit-learn' and 'torch.no_grad()', but does not provide version numbers for these or any other software dependencies.
Experiment Setup Yes MLP models used a learning rate of 1e-3 and batch size of 512, while transformer models used a learning rate of 5e-4 and batch size of 1024. We set K = 64 for K-means, and discretisation used 20 bins per action dimension. For diffusion models, we set T = 50 and standard β schedules linearly decaying in [1e-4, 0.02].