Deterministic training of generative autoencoders using invertible layers

Authors: Gianluigi Silvestri, Daan Roos, Luca Ambrogioni

ICLR 2023

For each reproducibility variable, the assessed result is given together with the LLM response that supports it.

Research Type: Experimental
LLM Response: "We now show empirically that, at least in complex naturalistic datasets such as CelebA-HQ and ImageNet (Karras et al., 2018; Deng et al., 2009), the exact maximum likelihood training leads to drastically improved results compared to architecturally equivalent VAEs. Our aim is to show the difference in performance between exact log-likelihood and ELBO objective functions when the architecture is kept constant. ... Table 1 shows that AEFs significantly outperform their architecturally equivalent VAEs both in terms of bits per dimension and FID score, generating significantly sharper and more detailed samples (Fig. 1, 2)."

Researcher Affiliation: Collaboration
LLM Response:
- Gianluigi Silvestri, OnePlanet Research Center, imec-the Netherlands; Donders Institute for Brain, Cognition and Behaviour (gianluigi.silvestri@imec.nl)
- Daan Roos, UvA-Bosch Delta Lab (d.f.a.roos@uva.nl)
- Luca Ambrogioni, Donders Institute for Brain, Cognition and Behaviour, Radboud University (l.ambrogioni@donders.ru.nl)

Pseudocode: Yes
LLM Response: Algorithm 1 ("Negative Log-Likelihood AEF on expanded ambient space"), with notation: g: encoder; f: decoder; h: feature expansion map; n: core encoder; p0: base distribution (prior); r: error distribution; θ: model parameters; γ: feature expansion parameters; x: input image.

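To make the algorithm's ingredients concrete, here is a minimal PyTorch sketch of how the quantities listed above can combine into an exact negative log-likelihood. It is a hypothetical reconstruction under stated assumptions (in particular, that the invertible core n returns a log-det-Jacobian alongside the latent code), not the authors' implementation from the repository below.

```python
import torch

def aef_nll(x, g, f, h, n, p0, r):
    """Hedged sketch of Algorithm 1: exact NLL of an AEF on the expanded
    ambient space. Names follow the algorithm's notation; the exact wiring
    of the components is an assumption, not the paper's code."""
    y = h(x)             # feature expansion map h_gamma lifts x to the expanded space
    z, logdet = n(g(y))  # invertible core: latent code z plus log|det J| (assumed)
    x_hat = f(z)         # deterministic decoder reconstructs the input
    log_prior = p0.log_prob(z).flatten(1).sum(-1)         # prior (base distribution) term
    log_error = r.log_prob(x - x_hat).flatten(1).sum(-1)  # error distribution on the residual
    return -(log_prior + log_error + logdet)              # per-example exact NLL
```

Because every term is a deterministic function of x, this objective can be optimized directly, which is the contrast with the ELBO drawn in the Research Type row above.
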
Open Source Code: Yes
LLM Response: "The code for all the experiments is available at: https://github.com/gisilvs/AEF."

Open Datasets: Yes
LLM Response: "To compare the generative performance of AEFs with VAEs we test on CelebA-HQ resized to 64×64 and 32×32, and ImageNet resized to 32×32. ... Additionally, we look at smaller scale datasets: MNIST, FashionMNIST and KMNIST (Deng, 2012; Xiao et al., 2017; Clanuwat et al., 2018)."

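All three smaller-scale datasets are available through torchvision; a minimal loading sketch (generic torchvision usage, not the paper's data pipeline) looks like this:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# The three smaller-scale datasets share a common torchvision API.
mnist  = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
fmnist = datasets.FashionMNIST("data", train=True, download=True, transform=to_tensor)
kmnist = datasets.KMNIST("data", train=True, download=True, transform=to_tensor)

# CelebA-HQ and ImageNet are not bundled at these resolutions; a Resize
# transform reproduces the 32x32 (or 64x64) inputs from raw images.
resize32 = transforms.Compose([transforms.Resize((32, 32)), transforms.ToTensor()])
```
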
Dataset Splits: Yes
LLM Response: "For all the models we use 10% of the training dataset as validation set, apart from CelebA-HQ for which we use the predefined train-val-test split."

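A 90/10 split of this kind can be reproduced with torch.utils.data.random_split. The sketch below reuses the mnist loader from the previous example; the fixed seed is an assumption, not something the paper specifies.

```python
import torch
from torch.utils.data import random_split

# Hold out 10% of the training data for validation (all datasets except
# CelebA-HQ, which ships with a predefined train-val-test split).
n_val = int(0.1 * len(mnist))
train_set, val_set = random_split(
    mnist, [len(mnist) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
```
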
Hardware Specification: Yes
LLM Response: "We used Azure ML and Google Colab Pro to run all our experiments. All the manifold learning and denoising experiments are run on one NVIDIA M60 GPU, while we use one NVIDIA K80 GPU for the experiments on CelebA-HQ and ImageNet."

Software Dependencies: No
LLM Response: The paper mentions using and adapting implementations from Durkan et al. (2020) for normalizing flows, i.e. "nflows: normalizing flows in PyTorch", which implies PyTorch. However, no specific version numbers for PyTorch or any other software dependency are stated in the text.

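For reference, nflows is a PyTorch library for normalizing flows; the snippet below is a generic usage sketch of its public API, not the paper's adapted code:

```python
import torch
from nflows.flows.base import Flow
from nflows.distributions.normal import StandardNormal
from nflows.transforms.autoregressive import MaskedAffineAutoregressiveTransform

# A small two-dimensional flow with a standard-normal base distribution.
flow = Flow(
    transform=MaskedAffineAutoregressiveTransform(features=2, hidden_features=32),
    distribution=StandardNormal(shape=[2]),
)
log_prob = flow.log_prob(inputs=torch.randn(16, 2))  # exact log-density of 16 samples
```
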
Experiment Setup: Yes
LLM Response: "On the MNIST datasets we train all models for 100K iterations... For CelebA-HQ... we train instead for 1M iterations... ImageNet models are trained for 2M iterations... As optimizer we choose Adam (Kingma & Ba, 2015) with a learning rate of 1e-3 for the MNIST-like datasets, and 1e-4 for the denoising experiments, CelebA-HQ and ImageNet. We use a batch size of 128 for all the MNIST experiments, a batch size of 64 for CelebA-HQ resized to 32×32, and a batch size of 16 for CelebA-HQ resized to 64×64 and ImageNet."

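Put together, the MNIST-scale configuration can be sketched as below. The stand-in data and model are placeholders so the sketch runs; the real experiments use the AEF objective and dataset loaders from the sketches above, so treat this as a template for the quoted hyperparameters rather than the authors' training script.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data and model; the real runs use the AEF components and loaders above.
train_set = TensorDataset(torch.rand(1024, 1, 28, 28),
                          torch.zeros(1024, dtype=torch.long))
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 28 * 28))

loader = DataLoader(train_set, batch_size=128, shuffle=True)  # batch size 128 (MNIST-scale)
optimizer = optim.Adam(model.parameters(), lr=1e-3)           # Adam, lr 1e-3 (MNIST-like data)

for step, (x, _) in enumerate(loader):              # paper: 100K iterations on MNIST
    loss = ((model(x) - x.flatten(1)) ** 2).mean()  # placeholder for the AEF NLL
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
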