Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling

Authors: Đorđe Miladinović, Aleksandar Stanić, Stefan Bauer, Jürgen Schmidhuber, Joachim M. Buhmann

ICLR 2021

Reproducibility assessment. Each entry below gives the reproducibility variable, the result, and the LLM's supporting response.
Research Type: Experimental
SDN is experimentally examined in two different settings. In real-world image density modeling, the SDN-empowered hierarchical VAE is shown to reach considerably higher test log-likelihoods than baseline CNN-based architectures, and it can synthesize perceptually appealing, coherent images even at high sampling temperatures. In a synthetic-data setting, enhancing a non-hierarchical VAE with an SDN facilitates learning of factorial latent codes, suggesting that unsupervised disentanglement of representations can be improved by using more powerful neural architectures, with SDN standing out as a strong candidate model.
Researcher Affiliation: Academia
Đorđe Miladinović, ETH Zurich, Zürich, Switzerland; Aleksandar Stanić, Swiss AI Lab IDSIA, USI, Lugano, Switzerland; Stefan Bauer, Max Planck Institute, Tübingen, Germany; Jürgen Schmidhuber, Swiss AI Lab IDSIA, USI, Lugano, Switzerland; Joachim M. Buhmann, ETH Zurich, Zürich, Switzerland
Pseudocode: Yes
Algorithm 1: Bottom-to-top sweep of the correction stage.
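The algorithm itself is not reproduced in this report, but the following minimal sketch illustrates the general idea of a bottom-to-top recurrent sweep over a feature map, where each row is corrected conditioned on the row processed before it. The GRUCell-based update and the tensor layout are assumptions for illustration, not the authors' SDN cells.

```python
# Illustrative sketch of a bottom-to-top recurrent sweep over a feature map,
# in the spirit of Algorithm 1's correction stage. The GRUCell update is an
# assumption; the paper's SDN cells differ in detail.
import torch
import torch.nn as nn

class BottomToTopSweep(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # One shared recurrent cell: input = features of the current row,
        # hidden state = corrected features of the row processed just before.
        self.cell = nn.GRUCell(channels, channels)

    def forward(self, f):                      # f: (B, C, H, W)
        B, C, H, W = f.shape
        h = f.new_zeros(B * W, C)              # initial state below the bottom row
        corrected = []
        for i in reversed(range(H)):           # sweep rows bottom-to-top
            x = f[:, :, i, :].permute(0, 2, 1).reshape(B * W, C)
            h = self.cell(x, h)                # correct row i given the row below
            corrected.append(h.reshape(B, W, C).permute(0, 2, 1))
        return torch.stack(corrected[::-1], dim=2)  # back to (B, C, H, W)
```

A forward pass on a (B, C, H, W) feature map returns a corrected tensor of the same shape, so such a layer can be dropped into a convolutional decoder.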
Open Source Code: Yes
The accompanying source code is available at: https://github.com/djordjemila/sdn
Open Datasets: Yes
Density estimation. Given a set of i.i.d. images $D_{\text{train}} = X_{1..N}$, the true probability density function $p(X)$ is estimated via a parametric model $p_\theta(X)$, whose parameters $\theta$ are learned using the maximum log-likelihood objective: $\arg\max_\theta \big[\log p_\theta(X) \approx \frac{1}{N}\sum_{i=1}^{N}\log p_\theta(X_i)\big]$. The learned $p_\theta(X)$ is evaluated by computing the test log-likelihood on a held-out set of images $D_{\text{test}} = X_{1..K}$. SDN-VAE and the competing methods were tested on CIFAR-10 (Krizhevsky et al., 2009), ImageNet32 (Van Oord et al., 2016), and CelebAHQ256 (Karras et al., 2017). A quantitative comparison is given in Table 1.
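To make the evaluation protocol concrete, here is a hedged sketch of how the average test log-likelihood (and its common bits-per-dimension form) would be computed from per-image log-likelihoods; `model.log_prob` is a hypothetical interface, not the API of the released code.

```python
# Sketch: estimating the test log-likelihood (and bits/dim) of a trained
# density model. `model.log_prob(x)` returning per-image log p_theta(x) in
# nats is a hypothetical interface, not the repository's actual API.
import math
import torch

@torch.no_grad()
def test_log_likelihood(model, test_loader, device="cuda"):
    total_nats, total_images, dims = 0.0, 0, None
    for x, _ in test_loader:
        x = x.to(device)
        if dims is None:
            dims = x[0].numel()            # e.g. 3*32*32 for CIFAR-10
        total_nats += model.log_prob(x).sum().item()
        total_images += x.size(0)
    avg_nats = total_nats / total_images   # (1/K) * sum_i log p_theta(X_i)
    bits_per_dim = -avg_nats / (dims * math.log(2))
    return avg_nats, bits_per_dim
```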
Dataset Splits: No
The paper gives details on training and test samples but does not explicitly specify the training/validation/test splits; in particular, no percentages or counts are given for a validation set.
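Because no validation split is documented, a reproduction has to choose one. A minimal sketch, assuming an illustrative 45,000/5,000 split of the CIFAR-10 training set (these counts are not from the paper):

```python
# Minimal sketch: carving a validation set out of CIFAR-10's training set.
# The 45k/5k split below is an illustrative assumption, NOT from the paper.
import torch
from torchvision import datasets, transforms

train_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())

generator = torch.Generator().manual_seed(0)  # fixed seed for a reproducible split
train_set, val_set = torch.utils.data.random_split(train_full, [45000, 5000],
                                                   generator=generator)
```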
Hardware Specification: Yes
GPU type: Tesla V100; GPU memory: 32 GB (Table 3).
Software Dependencies: No
The paper lists various model components and optimizers (e.g., Adamax, Gaussian and DML likelihoods, mixed precision, weight normalization) but does not specify version numbers for the programming language, machine-learning frameworks (e.g., PyTorch, TensorFlow), or other key software libraries.
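Since no versions are pinned, a reproduction can at least record the environment it actually ran in. A minimal sketch, assuming a PyTorch environment with CUDA available:

```python
# Sketch: logging the library versions a run actually used, since the paper
# does not pin any. Assumes a PyTorch environment with CUDA available.
import sys
import torch

print("python:", sys.version.split()[0])
print("torch :", torch.__version__)
print("cuda  :", torch.version.cuda)
print("cudnn :", torch.backends.cudnn.version())
```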
Experiment Setup: Yes
Table 3 gives the experimental configurations for the density-estimation tests, including: optimizer Adamax; learning rate 0.002; batch size per GPU 32; DML mixture components 5; free bits 0.01; mixed precision: yes; weight normalization: yes; horizontal-flip data augmentation: yes.
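As a reference point, here is a minimal sketch of that configuration in PyTorch; the hyperparameter values come from Table 3, while the placeholder model, the loss, and the AMP wiring are assumptions, not the authors' code.

```python
# Sketch of the Table 3 configuration in PyTorch. Hyperparameters (Adamax,
# lr 0.002, batch size 32 per GPU, mixed precision, weight normalization,
# horizontal flips) are from the paper; the model and loss are placeholders,
# not the SDN-VAE.
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),   # horizontal-flip augmentation: yes
    transforms.ToTensor(),
])

model = nn.Sequential(                   # placeholder, not the SDN-VAE
    weight_norm(nn.Conv2d(3, 64, 3, padding=1)),   # weight normalization: yes
    nn.ReLU(),
    weight_norm(nn.Conv2d(64, 3, 3, padding=1)),
).cuda()

optimizer = torch.optim.Adamax(model.parameters(), lr=2e-3)  # Adamax, lr 0.002
scaler = torch.cuda.amp.GradScaler()     # mixed precision: yes

def train_step(x):                       # x: a batch of 32 images per GPU
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # fp16/fp32 mixed-precision forward
        loss = (model(x) - x).pow(2).mean()  # placeholder loss, not the ELBO
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```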