Rolling Diffusion Models

Authors: David Ruhe, Jonathan Heek, Tim Salimans, Emiel Hoogeboom

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we show that when the temporal dynamics are complex, Rolling Diffusion is superior to standard diffusion. In particular, this result is demonstrated in a video prediction task using the Kinetics-600 video dataset and in a chaotic fluid dynamics forecasting experiment."
Researcher Affiliation | Collaboration | "¹Google DeepMind, Amsterdam, Netherlands; ²University of Amsterdam, Netherlands."
Pseudocode | Yes | "Algorithm 1 Rolling Diffusion: Training; Algorithm 2 Rolling Diffusion: Rollout" (a hedged sketch of both algorithms follows the table)
Open Source Code | No | The paper neither states unambiguously that code for the described methodology will be released nor links to a source-code repository.
Open Datasets | Yes | "video prediction task using the Kinetics-600 video dataset (Kay et al., 2017) and in an experiment involving chaotic fluid mechanics simulations. ... BAIR robot pushing dataset (Ebert et al., 2017) is a standard benchmark for video prediction."
Dataset Splits | No | The paper uses standard benchmarks such as BAIR and Kinetics-600 and refers to "evaluation sets" and test-time setups, but it does not give explicit train/validation/test splits (percentages, sample counts, or named predefined splits) for reproduction.
Hardware Specification | No | The paper does not specify the hardware used for the experiments, such as GPU or CPU models, or detailed cloud computing resources.
Software Dependencies | No | The paper refers to frameworks such as JAX-CFD and the Simple Diffusion architecture but provides no version numbers for any software dependencies, libraries, or solvers used in the experiments.
Experiment Setup | Yes | "Appendix C. Hyperparameters. Throughout the experiments we use U-ViTs, which are essentially U-Nets with MLP blocks instead of convolutional layers when self-attention is used in a block. ... Blocks: [3 + 3, 3 + 3, 3 + 3, 8]; Channels: [128, 256, 512, 1024]; Head Dim: 128; Dropout: [0, 0.1, 0.1, 0.1]; ... learning rate: 1e-4" (collected into a config sketch below)
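
As a reference for the Pseudocode row, here is a minimal JAX sketch of what Algorithms 1 and 2 describe. It is not the authors' code: the `denoiser(params, z, t_k)` signature, the cosine noise schedule, v-prediction, the deterministic DDIM-style sampler, and the linear local-time assignment t_k = (k + t)/W over a window of W frames are all assumptions made for illustration, and the paper's boundary conditions for the start of a video are omitted.

```python
# Hedged sketch of Rolling Diffusion training (Algorithm 1) and rollout
# (Algorithm 2). Illustrative only: schedule, parameterization, and the
# denoiser interface are assumptions, not the authors' implementation.
import jax
import jax.numpy as jnp

def local_times(t, W):
    """Per-frame diffusion times for a window of W frames.

    Frame k gets local time (k + t) / W: the oldest frame (k = 0) is
    nearly clean, the newest (k = W - 1) is nearly pure noise.
    """
    return jnp.clip((jnp.arange(W) + t) / W, 0.0, 1.0)

def _alpha_sigma(t_k):
    # Assumed variance-preserving cosine schedule, broadcast over frames.
    a = jnp.cos(0.5 * jnp.pi * t_k)[:, None, None, None]
    s = jnp.sin(0.5 * jnp.pi * t_k)[:, None, None, None]
    return a, s

def training_loss(params, denoiser, frames, key):
    """v-prediction loss on one window of shape (W, H, W_img, C)."""
    W = frames.shape[0]
    t_key, eps_key = jax.random.split(key)
    t = jax.random.uniform(t_key)           # shared window time in [0, 1)
    t_k = local_times(t, W)                 # monotone per-frame noise levels
    eps = jax.random.normal(eps_key, frames.shape)
    a, s = _alpha_sigma(t_k)
    z = a * frames + s * eps                # noise each frame at its own t_k
    v = a * eps - s * frames                # v-prediction target (assumed)
    v_hat = denoiser(params, z, t_k)        # network conditions on t_k
    return jnp.mean((v_hat - v) ** 2)

def rollout(params, denoiser, window, key, n_new_frames, n_steps=32):
    """Sliding-window sampling: emit one clean frame per window slide."""
    W = window.shape[0]
    out = []
    for _ in range(n_new_frames):
        # Sweep the shared time t from 1 to 0; the oldest frame's local
        # time reaches 0 (clean) while the newest reaches (W - 1) / W.
        for i in range(n_steps):
            t, t_next = 1.0 - i / n_steps, 1.0 - (i + 1) / n_steps
            t_k, t_k_next = local_times(t, W), local_times(t_next, W)
            a, s = _alpha_sigma(t_k)
            v_hat = denoiser(params, window, t_k)
            x_hat = a * window - s * v_hat        # predicted clean frames
            eps_hat = s * window + a * v_hat      # implied noise
            a_n, s_n = _alpha_sigma(t_k_next)
            window = a_n * x_hat + s_n * eps_hat  # deterministic DDIM step
        out.append(window[0])                     # oldest frame is now clean
        key, sub = jax.random.split(key)
        fresh = jax.random.normal(sub, window[:1].shape)
        window = jnp.concatenate([window[1:], fresh])  # slide window by one
    return jnp.stack(out)
```

The key property this sketch preserves is that after one full sweep (t: 1 → 0) and a window shift, every surviving frame lands at exactly the local time it would have at the start of the next sweep, so the denoising process "rolls" seamlessly across frame positions.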
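For the Experiment Setup row, the Appendix C values quoted above are gathered below into a single configuration for readability. The field names and the reading of "3 + 3" as down plus up blocks per resolution are ours, not an API or notation from the paper; values not quoted (optimizer, batch size, training steps) are left out rather than guessed.

```python
# Appendix C hyperparameters quoted in the table, collected in one place.
# Field names are illustrative; only values quoted from the paper appear.
uvit_hparams = dict(
    blocks=[3 + 3, 3 + 3, 3 + 3, 8],  # our reading: down + up per stage; 8 middle
    channels=[128, 256, 512, 1024],
    head_dim=128,
    dropout=[0.0, 0.1, 0.1, 0.1],
    learning_rate=1e-4,
)
```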