Hierarchical Semi-Implicit Variational Inference with Application to Diffusion Model Acceleration
Authors: Longlin Yu, Tianyu Xie, Yu Zhu, Tong Yang, Xiangyu Zhang, Cheng Zhang
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we demonstrate the effectiveness of HSIVI on both Bayesian inference tasks with complicated target distributions and diffusion model acceleration. When used for diffusion model acceleration, we show that HSIVI can produce high quality samples comparable to or better than the existing fast diffusion model based samplers with a small number of function evaluations on various datasets. |
| Researcher Affiliation | Collaboration | Longlin Yu1, Tianyu Xie1, Yu Zhu3,4, Tong Yang5, Xiangyu Zhang5, Cheng Zhang1,2. 1 School of Mathematical Sciences, Peking University; 2 Center for Statistical Science, Peking University; 3 Institute of Automation, Chinese Academy of Sciences; 4 Beijing Academy of Artificial Intelligence; 5 Megvii Technology Inc. |
| Pseudocode | Yes | Algorithm 1 Hierarchical semi-implicit variational inference (sequential training) and Algorithm 2 Hierarchical semi-implicit variational inference (joint training) |
| Open Source Code | Yes | The code is available at https://github.com/longinYu/HSIVI. |
| Open Datasets | Yes | We test four synthetic 2D datasets: Checkerboard, Circles, Moons, and Swissroll (Pedregosa et al., 2011). MNIST, CIFAR-10, CelebA & ImageNet. We take the pretrained noise models for CIFAR-10 and ImageNet separately from https://github.com/tqch/ddpm-torch/releases/download/checkpoints/cifar10_2040.pt and https://openaipublic.blob.core.windows.net/diffusion/march-2021/imagenet64_uncond_100M_1500K.pt. |
| Dataset Splits | No | The paper uses well-known public datasets (e.g., MNIST, CIFAR-10) that typically have predefined splits. However, the paper does not explicitly state the specific train/validation/test split percentages, sample counts, or refer to a specific predefined split with a citation for its experiments. |
| Hardware Specification | Yes | Experiments need about 1.5 days on CIFAR-10, about 3 days on CelebA, and about 4 days on ImageNet using 8 Nvidia 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions using Adam optimizer and a PyTorch implementation of UNet, but it does not specify concrete software versions (e.g., PyTorch version, Python version, or specific library versions). |
| Experiment Setup | Yes | We set the learning rate of variational parameters ϕt (or ϕ) to 0.001 and the learning rate of ψt (or ψ) to 0.002 in both SIVI and HSIVI. For HSIVI-LB and HSIVI-SM, we run 80000 variational parameter updates for every conditional layer; for SIVI-LB and SIVI-SM, we run 5 × 80000 variational parameter updates. All the algorithms are trained with a batch size of 64. |
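The optimizer settings quoted in the Experiment Setup row can be restated as a short configuration sketch. This is a minimal illustration only, assuming a PyTorch setup consistent with the Adam optimizer mentioned under Software Dependencies; `phi_params` and `psi_params` are hypothetical stand-ins for the variational parameters ϕt (or ϕ) and the auxiliary parameters ψt (or ψ), whose actual shapes and update schedule (Algorithms 1 and 2) are not given in this report.

```python
import torch

# Constants quoted in the Experiment Setup row.
NUM_UPDATES_PER_LAYER = 80_000   # HSIVI-LB / HSIVI-SM updates per conditional layer
NUM_UPDATES_SIVI = 5 * 80_000    # SIVI-LB / SIVI-SM total updates
BATCH_SIZE = 64                  # batch size used for all algorithms

# Hypothetical parameter groups standing in for phi_t (or phi) and psi_t (or psi).
phi_params = [torch.nn.Parameter(torch.zeros(8))]
psi_params = [torch.nn.Parameter(torch.zeros(8))]

# Learning rates quoted in the paper: 0.001 for phi and 0.002 for psi, with Adam.
opt_phi = torch.optim.Adam(phi_params, lr=1e-3)
opt_psi = torch.optim.Adam(psi_params, lr=2e-3)

# The alternating updates of phi and psi follow Algorithm 1 (sequential training)
# or Algorithm 2 (joint training), which this report does not reproduce.
```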
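The two pretrained noise-model checkpoints listed in the Open Datasets row can also be fetched programmatically. The snippet below is a minimal sketch, assuming a standard PyTorch environment; the use of `torch.hub.load_state_dict_from_url` and the variable names are illustrative and not part of the paper's released code.

```python
import torch

# Checkpoint URLs quoted in the Open Datasets row.
CIFAR10_CKPT = (
    "https://github.com/tqch/ddpm-torch/releases/download/"
    "checkpoints/cifar10_2040.pt"
)
IMAGENET64_CKPT = (
    "https://openaipublic.blob.core.windows.net/"
    "diffusion/march-2021/imagenet64_uncond_100M_1500K.pt"
)

# Download (and cache) the checkpoints; map to CPU so the sketch runs without a GPU.
cifar10_state = torch.hub.load_state_dict_from_url(CIFAR10_CKPT, map_location="cpu")
imagenet_state = torch.hub.load_state_dict_from_url(IMAGENET64_CKPT, map_location="cpu")

# Loading these state dicts into a usable UNet requires the matching architecture
# definitions (ddpm-torch for CIFAR-10, OpenAI's ImageNet 64x64 diffusion model),
# e.g. model.load_state_dict(cifar10_state).
```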