A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs

Authors: Fabian Falck, Christopher Williams, Dominic Danks, George Deligiannidis, Christopher Yau, Chris C Holmes, Arnaud Doucet, Matthew Willetts

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In the following we probe the theoretical understanding of HVAEs gained through our framework, demonstrating its utility in four experimental analyses: (a) Improving parameter efficiency in HVAEs, (b) Time representation in HVAEs and how they make use of it, (c) Sampling instabilities in HVAEs, and (d) Ablation studies. We train HVAEs using VDVAE [9] as the basis model on five datasets: MNIST [42], CIFAR10 [43], two downsampled versions of ImageNet [44, 45], and CelebA [46], splitting each into a training, validation and test set (see Appendix D for details). In general, reported numeric values refer to Negative Log-Likelihood (NLL) in nats (MNIST) or bits per dim (all other datasets) on the test set at model convergence, if not stated otherwise.
Researcher Affiliation Academia Fabian Falck 1,3,4, Christopher Williams 1, Dominic Danks 2,4, George Deligiannidis 1, Christopher Yau 1,3,4, Chris Holmes 1,3,4, Arnaud Doucet 1, Matthew Willetts 4. Affiliations: 1 University of Oxford; 2 University of Birmingham; 3 Health Data Research UK; 4 The Alan Turing Institute
Pseudocode No The paper describes methods in text and uses figures to illustrate concepts, but does not contain a pseudocode block or an explicitly labeled algorithm.
Open Source Code Yes We provide our PyTorch code base at https://github.com/FabianFalck/unet-vdvae (see Appendix C for details).
Open Datasets Yes We train HVAEs using VDVAE [9] as the basis model on five datasets: MNIST [42], CIFAR10 [43], two downsampled versions of ImageNet [44, 45], and CelebA [46], splitting each into a training, validation and test set (see Appendix D for details).
Dataset Splits Yes We train HVAEs using VDVAE [9] as the basis model on five datasets: MNIST [42], CIFAR10 [43], two downsampled versions of ImageNet [44, 45], and CelebA [46], splitting each into a training, validation and test set (see Appendix D for details).
Hardware Specification Yes Due to the significant computational cost of training extremely deep HVAEs (multiple Nvidia A100 graphics cards with 40GB of GPU memory each running for 3 weeks per run)
Software Dependencies Yes We used the following software packages: PyTorch [56] (1.10.1+cu113), NumPy [57] (1.21.5), WandB [58] (0.12.9), Apex [59] (21.8), Python [60] (3.8.10), Matplotlib [61] (3.5.1), Imageio [62] (2.13.5), mpi4py [63] (3.1.3), scikit-learn [64] (1.0.2), Pillow [65] (9.0.0).
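As a hedged sketch, the pinned versions quoted above could be captured in a `requirements.txt`-style file; the package names below are the standard PyPI names, which are assumed here rather than stated in the paper, and Python 3.8.10 refers to the interpreter itself rather than a pip package:

```text
# requirements.txt (sketch; versions as listed in the paper)
# torch 1.10.1+cu113 is a CUDA-specific build, typically installed via
# PyTorch's extra index, e.g.:
#   pip install torch==1.10.1+cu113 \
#     --extra-index-url https://download.pytorch.org/whl/cu113
numpy==1.21.5
wandb==0.12.9
matplotlib==3.5.1
imageio==2.13.5
mpi4py==3.1.3
scikit-learn==1.0.2
Pillow==9.0.0
# NVIDIA Apex (listed as 21.8) is usually built from source
# (https://github.com/NVIDIA/apex) rather than installed from PyPI.
```
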
Experiment Setup Yes We train HVAEs using VDVAE [9] as the basis model on five datasets: MNIST [42], CIFAR10 [43], two downsampled versions of ImageNet [44, 45], and CelebA [46], splitting each into a training, validation and test set (see Appendix D for details). and We train VDVAE closely following the state-of-the-art hyperparameter configurations in [9], specifically with the same number of parameterised blocks and without weight-sharing (VDVAE), and compare them against models with weight-sharing (WS-VDVAE) and fewer parameters, i.e. fewer parameterised blocks, in Table 1.