What does guidance do? A fine-grained analysis in a simple setting

Authors: Muthu Chidambaram, Khashayar Gatmiry, Sitan Chen, Holden Lee, Jianfeng Lu

NeurIPS 2024

Reproducibility variables, each listed with its result and the supporting LLM response:

Research Type: Experimental
"In addition to verifying these results empirically in synthetic settings, we also show how our theoretical insights can offer useful prescriptions for practical deployment." (Section 3, Experiments: "Here we empirically verify the guidance dynamics predicted by Theorems 1 and 2.")

Researcher Affiliation: Academia
Muthu Chidambaram (Duke University, muthu@cs.duke.edu), Khashayar Gatmiry (MIT, gatmiry@mit.edu), Sitan Chen (Harvard University, sitan@seas.harvard.edu), Holden Lee (Johns Hopkins University, hlee283@jhu.edu), Jianfeng Lu (Duke University, jianfeng@math.duke.edu)

Pseudocode: No
No pseudocode or algorithm blocks were found in the paper.

Open Source Code: Yes
"We include code in the supplementary material for recreating the numerical experiments in the paper."

Open Datasets: Yes
"Simpler image datasets are known to be close to linearly separable; in particular, MNIST." To conduct experiments on ImageNet, the paper uses the classifier-guided ImageNet models available from [13].

Dataset Splits: No
The paper mentions training models (e.g., training the guidance model of [28] for 40 epochs on MNIST), but it does not explicitly specify the train/validation/test splits used, nor does it describe new training with such splits; it often relies on pre-existing models or models trained in prior work.

Hardware Specification: Yes
"All experiments in this section were conducted on a single A5000 GPU."

Software Dependencies: No
The paper mentions using JAX [2] and PyTorch [27] for experiments but does not provide version numbers for these dependencies in the main text.

Experiment Setup: Yes
"For solving, we use 1000 evaluation steps and take T = 10..." "For sampling, we use DDPM [18] with 400 time steps and a linear noise schedule, and we found that training the guidance model of [28] for 40 epochs was sufficient to generate high quality samples." "For sampling, we use DDIM [32] with 25 steps."
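
As background for the "guidance dynamics" the paper studies, here is a minimal sketch of how one common form of guidance, classifier-free guidance, combines conditional and unconditional scores. The function name `guided_score` and the toy Gaussian scores are illustrative assumptions, not the authors' code:

```python
import numpy as np

def guided_score(score_cond, score_uncond, w):
    """Classifier-free guidance: (1 + w) * s_cond - w * s_uncond.

    w = 0 recovers the plain conditional score; larger w pushes
    samples further toward the conditional model's modes.
    """
    return (1.0 + w) * score_cond - w * score_uncond

# Toy example: scores of two unit Gaussians, N(1, 1) conditional
# and N(0, 1) unconditional, evaluated on a small grid.
x = np.linspace(-2.0, 2.0, 5)
s_cond = -(x - 1.0)   # score (grad log density) of N(1, 1)
s_uncond = -x         # score of N(0, 1)

print(guided_score(s_cond, s_uncond, w=0.0))  # identical to s_cond
print(guided_score(s_cond, s_uncond, w=2.0))  # extrapolated past s_cond
```

The w = 0 case reducing to the conditional score is a useful sanity check when implementing guidance.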
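
The experiment-setup row mentions DDPM sampling with 400 time steps and a linear noise schedule. A minimal sketch of such a schedule follows, assuming the standard endpoint values 1e-4 and 0.02 from the DDPM paper [18]; the paper's exact endpoints are not quoted above, so these are placeholders:

```python
import numpy as np

# Linear beta schedule over T = 400 diffusion time steps.
# Endpoints 1e-4 and 0.02 are the common DDPM defaults (assumed here).
T = 400
betas = np.linspace(1e-4, 0.02, T)

alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative products, used in the DDPM update

# By the final step, alpha_bar_T is close to zero: the forward process
# has destroyed nearly all of the signal, as intended.
print(alpha_bars[-1])
```

The `alpha_bars` array is what a DDPM sampler consumes at each of the 400 steps; a DDIM sampler with 25 steps would index a coarse subsequence of the same schedule.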