Constrained Diffusion with Trust Sampling

Authors: William Huang, Yifeng Jiang, Tom Van Wouwe, Karen Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the efficacy of our method through extensive experiments on complex tasks, and in drastically different domains of images and 3D motion generation, showing significant improvements over existing methods in terms of generation quality."
Researcher Affiliation | Academia | William Huang (Stanford University, willsh@stanford.edu); Yifeng Jiang (Stanford University, yifengj@stanford.edu); Tom Van Wouwe (Stanford University, tvwouwe@stanford.edu); C. Karen Liu (Stanford University, karenliu@cs.stanford.edu)
Pseudocode | Yes | "Algorithm 1: Trust Sampling with DDIM" (a hedged sketch of such a loop follows the table)
Open Source Code | Yes | "Our implementation is available at https://github.com/will-s-h/trust-sampling."
Open Datasets | Yes | "FFHQ 256×256 [26] and ImageNet 256×256 [12]"
Dataset Splits | No | The paper mentions '100 validation images' as the set for quantitative evaluation, which effectively serves as a test set, but it does not specify a separate validation split for hyperparameter tuning distinct from training and testing.
Hardware Specification | Yes | "We ran inference on an A5000 GPU, which takes roughly 1 minute to generate an image for FFHQ and 6 minutes to generate an image for ImageNet, due to the larger network size. For motion tasks, the diffusion model was trained on a single A4000 GPU for approximately 24 hours."
Software Dependencies | No | The paper describes inference with pretrained models but does not explicitly list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "Table 7: Parameters used for all experiments. Start and end refer to the start and end of the stochastic linear trust schedules."
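
To make the Pseudocode and Experiment Setup rows concrete, below is a minimal PyTorch sketch of what a constrained DDIM loop with a linear per-step gradient budget might look like. This is an illustration only, not the paper's Algorithm 1: the function names (`denoiser`, `constraint_loss`), the DPS-style gradient update on the Tweedie estimate, the `step_size`, and the stochastic rounding of the schedule are all assumptions; only the notions "Trust Sampling with DDIM" and "stochastic linear trust schedules" with `start`/`end` parameters come from the paper.

```python
import torch

def linear_trust_schedule(num_steps: int, start: float, end: float) -> torch.Tensor:
    """Per-step iteration budget interpolated linearly from `start` to `end`.
    Stochastic rounding is one plausible reading of 'stochastic linear
    trust schedule'; the paper's exact definition may differ."""
    budget = torch.linspace(start, end, num_steps)
    # floor(b + U) rounds b up with probability equal to its fractional part.
    return (budget + torch.rand(num_steps)).floor().clamp(min=0).long()

@torch.no_grad()
def constrained_ddim_sketch(denoiser, constraint_loss, x_T, alphas_cumprod,
                            timesteps, schedule, step_size=0.1):
    """Hypothetical constrained DDIM loop: at each diffusion step, take up to
    `schedule[i]` gradient steps on the constraint before the DDIM update."""
    x = x_T
    for i, t in enumerate(timesteps):  # timesteps in descending order
        a_t = alphas_cumprod[t]
        a_prev = (alphas_cumprod[timesteps[i + 1]] if i + 1 < len(timesteps)
                  else torch.tensor(1.0))
        # Constraint-guidance inner loop, capped by the trust budget.
        for _ in range(int(schedule[i])):
            with torch.enable_grad():
                x_in = x.detach().requires_grad_(True)
                eps = denoiser(x_in, t)
                # Tweedie estimate of the clean sample from the noisy state.
                x0_hat = (x_in - (1 - a_t).sqrt() * eps) / a_t.sqrt()
                loss = constraint_loss(x0_hat)
                grad = torch.autograd.grad(loss, x_in)[0]
            x = x - step_size * grad  # move toward satisfying the constraint
        # Deterministic DDIM (eta = 0) transition to the next timestep.
        eps = denoiser(x, t)
        x0_hat = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x = a_prev.sqrt() * x0_hat + (1 - a_prev).sqrt() * eps
    return x
```

The design intuition this sketch tries to capture is that the schedule caps how much constraint optimization is trusted at each diffusion step, which is the sense of "trust" suggested by the paper's title and its Table 7 schedule parameters.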