Scalable Adaptive Computation for Iterative Generation

Authors: Allan Jabri, David J. Fleet, Ting Chen

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments with diffusion models show that RINs outperform U-Net architectures for image and video generation, as shown in Figure 1. |
| Researcher Affiliation | Collaboration | ¹Google Brain, Toronto. ²Department of EECS, UC Berkeley. Correspondence to: Ting Chen <iamtingchen@google.com>, Allan Jabri <ajabri@berkeley.edu>. |
| Pseudocode | Yes | Algorithm 3: RINs Implementation Pseudo-code. (A hedged sketch of one RIN block follows this table.) |
| Open Source Code | No | The paper does not provide an explicit statement about, or a link to, open-source code. |
| Open Datasets | Yes | For image generation, we mainly use the ImageNet dataset (Russakovsky et al., 2015). [...] We also use CIFAR-10 (Krizhevsky et al.) to show the model can be trained with small datasets. [...] For video prediction, we use the Kinetics-600 dataset (Carreira et al., 2018) at 16 × 64 × 64 resolution. |
| Dataset Splits | No | The paper does not explicitly state training/validation/test splits; it only mentions evaluating some metrics on 50K samples. |
| Hardware Specification | Yes | We train most models on 32 TPUv3 chips with a batch size of 1024. Models for 512 × 512 and 1024 × 1024 are trained on 64 TPUv3 chips and 256 TPUv4 chips, respectively. |
| Software Dependencies | No | The paper mentions using "major deep learning frameworks, such as TensorFlow (Abadi et al., 2016) and PyTorch (Paszke et al., 2019)" but does not specify version numbers. |
| Experiment Setup | Yes | Table C.2 (Training Hyper-parameters) lists Updates, Batch Size, LR, LR decay, Optimizer β2, Weight Decay, Self-conditioning Rate, and EMA β. (An illustrative training-setup sketch follows the block sketch below.) |
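For reference, the paper's Algorithm 3 gives pseudocode for the full model. Below is a minimal, hedged PyTorch sketch of the read-compute-write pattern of a single RIN block: latent tokens read from the interface via cross-attention, compute among themselves via self-attention, and write back via cross-attention. The class name `RINBlock`, the dimensions, and the use of `nn.MultiheadAttention` are illustrative assumptions, not the paper's exact implementation; the paper's blocks also include MLP sublayers and other details omitted here.

```python
import torch
import torch.nn as nn


class RINBlock(nn.Module):
    """Illustrative sketch of one RIN block (not the paper's exact code).

    Latents z read from the interface x (cross-attention), compute among
    themselves (self-attention), then write back to x (cross-attention).
    The paper's block also interleaves MLP sublayers, omitted here.
    """

    def __init__(self, interface_dim=256, latent_dim=512, num_heads=8):
        super().__init__()
        # Read: latents (queries) attend to interface tokens (keys/values).
        self.read = nn.MultiheadAttention(
            latent_dim, num_heads, kdim=interface_dim, vdim=interface_dim,
            batch_first=True)
        # Compute: self-attention among the (few) latent tokens.
        self.compute = nn.MultiheadAttention(
            latent_dim, num_heads, batch_first=True)
        # Write: interface tokens (queries) attend to latents (keys/values).
        self.write = nn.MultiheadAttention(
            interface_dim, num_heads, kdim=latent_dim, vdim=latent_dim,
            batch_first=True)
        self.norm_read = nn.LayerNorm(latent_dim)
        self.norm_compute = nn.LayerNorm(latent_dim)
        self.norm_write = nn.LayerNorm(interface_dim)

    def forward(self, x, z):
        # x: interface tokens, shape (B, N, interface_dim), e.g. patch tokens.
        # z: latent tokens, shape (B, K, latent_dim), with K much smaller than N.
        z = z + self.read(self.norm_read(z), x, x, need_weights=False)[0]
        h = self.norm_compute(z)
        z = z + self.compute(h, h, h, need_weights=False)[0]
        x = x + self.write(self.norm_write(x), z, z, need_weights=False)[0]
        return x, z
```

With, say, `x = torch.randn(2, 1024, 256)` and `z = torch.randn(2, 128, 512)`, a forward pass returns updated tokens of the same shapes. Because the bulk of computation happens on the K latents rather than the N interface tokens, per-block cost is largely decoupled from input size, which is the scalability argument the paper makes.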
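Similarly, the hyperparameter categories in Table C.2 (Updates, Batch Size, LR, LR decay, Optimizer β2, Weight Decay, Self-conditioning Rate, EMA β) map onto a standard diffusion training setup. The sketch below only illustrates how such knobs are commonly wired together; every numeric value is a placeholder rather than the paper's setting, the cosine LR decay is an assumption, and `RINBlock` is the toy module from the sketch above.

```python
import copy
import torch

# Placeholder values only; the paper's actual settings are in Table C.2.
num_updates = 150_000             # "Updates"
lr, optim_beta2 = 1e-3, 0.999     # "LR", "Optim β2"
weight_decay = 1e-2               # "Weight Dec."
ema_beta = 0.9999                 # "EMA β"
self_cond_rate = 0.9              # "Self-cond. Rate"

model = RINBlock()  # stand-in for the full RIN denoiser
optimizer = torch.optim.AdamW(
    model.parameters(), lr=lr, betas=(0.9, optim_beta2),
    weight_decay=weight_decay)
# "LR-decay": a cosine schedule is assumed here purely for illustration.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_updates)

# "EMA β": exponential moving average of weights, kept for sampling/eval.
ema_model = copy.deepcopy(model).requires_grad_(False)


@torch.no_grad()
def ema_update(ema_model, model, beta):
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(beta).add_(p, alpha=1.0 - beta)


# "Self-cond. Rate": with this probability, a training step conditions the
# model on its own detached estimate from a preliminary forward pass.
use_self_cond = bool(torch.rand(()) < self_cond_rate)
```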