Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling
Authors: Đorđe Miladinović, Aleksandar Stanić, Stefan Bauer, Jürgen Schmidhuber, Joachim M. Buhmann
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | SDN is experimentally examined in two different settings. In the context of real-life-image density modeling, SDN-empowered hierarchical VAE is shown to reach considerably higher test log-likelihoods than the baseline CNN-based architectures and can synthesize perceptually appealing and coherent images even at high sampling temperatures. In a synthetic data setting, we observe that enhancing a non-hierarchical VAE with an SDN facilitates learning of factorial latent codes, suggesting that unsupervised disentanglement of representations can be bettered by using more powerful neural architectures, where SDN stands out as a good candidate model. |
| Researcher Affiliation | Academia | Đorđe Miladinović, ETH Zurich, Zürich, Switzerland; Aleksandar Stanić, Swiss AI Lab IDSIA, USI, Lugano, Switzerland; Stefan Bauer, Max Planck Institute, Tübingen, Germany; Jürgen Schmidhuber, Swiss AI Lab IDSIA, USI, Lugano, Switzerland; Joachim M. Buhmann, ETH Zurich, Zürich, Switzerland |
| Pseudocode | Yes | Algorithm 1 Bottom-to-top sweep of the correction stage |
| Open Source Code | Yes | The accompanying source code is given at: https://github.com/djordjemila/sdn. |
| Open Datasets | Yes | Density estimation. From a set of i.i.d. images $D_{\text{train}} = X_{1..N}$, the true probability density function $p(X)$ is estimated via a parametric model $p_\theta(X)$, whose parameters $\theta$ are learned using the maximum log-likelihood objective: $\arg\max_\theta \left[ \log p_\theta(X) \right] \approx \arg\max_\theta \frac{1}{N} \sum_{i=1}^{N} \log p_\theta(X_i)$. The test log-likelihood is computed on an isolated set of images $D_{\text{test}} = X_{1..K}$, to evaluate the learned $p_\theta(X)$. SDN-VAE and the competing methods were tested on CIFAR-10 (Krizhevsky et al., 2009), ImageNet32 (Van Oord et al., 2016), and CelebA-HQ256 (Karras et al., 2017). Quantitative comparison is given in Table 1. |
| Dataset Splits | No | The paper reports training and test set sizes but does not explicitly specify a training/validation/test split; no percentages or counts are given for a validation set. |
| Hardware Specification | Yes | GPU type: Tesla V100; GPU memory: 32 GB (Table 3) |
| Software Dependencies | No | The paper lists various model components and optimizers (e.g., Adamax, Gaussian, DML, Mixed-precision, Weight normalization) but does not specify version numbers for programming languages, machine learning frameworks (e.g., PyTorch, TensorFlow), or other key software libraries. |
| Experiment Setup | Yes | Table 3: Experimental configurations for the density estimation tests. Includes details like Optimizer Adamax, Learning rate 0.002, Batch size per GPU 32, DML Mixture components 5, Free bits 0.01, Mixed-precision Yes, Weight normalization Yes, Horizontal flip data augmentation Yes. |
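The maximum-likelihood objective quoted in the Open Datasets row can be illustrated with a toy sketch. Assuming a fixed univariate Gaussian as the density model (a hypothetical stand-in for the paper's SDN-VAE, which models images), this computes the Monte Carlo estimate $\frac{1}{N}\sum_i \log p_\theta(X_i)$ on synthetic data, along with the bits-per-dimension conversion conventionally reported on benchmarks like CIFAR-10:

```python
import math
import random

def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    # log N(x | mu, sigma^2)
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def avg_log_likelihood(samples, logpdf):
    # Monte Carlo estimate: (1/N) * sum_i log p_theta(X_i)
    return sum(logpdf(x) for x in samples) / len(samples)

random.seed(0)
# Toy "test set" drawn from the model's own distribution (illustrative only).
data = [random.gauss(0.0, 1.0) for _ in range(10_000)]

nats = avg_log_likelihood(data, gaussian_logpdf)
bits_per_dim = -nats / math.log(2)  # convention for image-density benchmarks
print(f"avg log-likelihood: {nats:.3f} nats, {bits_per_dim:.3f} bits/dim")
```

For a standard Gaussian the expected value is $-\frac{1}{2}\log(2\pi) - \frac{1}{2} \approx -1.419$ nats, i.e. about 2.05 bits/dim; real image models report the analogous per-pixel quantity.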
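The "Free bits 0.01" entry in the experiment setup refers to a common VAE training trick: clamping each latent group's KL term from below so the optimizer cannot collapse unused latents to zero KL. A minimal sketch, assuming per-group KL values and a threshold that are illustrative rather than taken from the paper's code:

```python
def free_bits_kl(kl_per_group, free_bits=0.01):
    # Clamp each latent group's KL contribution from below, so the loss
    # stops rewarding a group once its KL falls under the free-bits budget.
    return sum(max(kl, free_bits) for kl in kl_per_group)

# Hypothetical per-group KL values (in nats); groups under 0.01 get floored.
kls = [0.5, 0.003, 0.0, 0.2]
total = free_bits_kl(kls)
print(total)  # 0.5 + 0.01 + 0.01 + 0.2 = 0.72
```

The floor prevents posterior collapse in individual latent groups while leaving groups that already carry information unaffected.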