Unscented Autoencoder

Authors: Faris Janjos, Lars Rosenbaum, Maxim Dolgov, J. Marius Zoellner

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically show competitive performance in Fréchet Inception Distance (FID) scores over closely-related models, in addition to a lower training variance than the VAE"; "We conduct rigorous experiments on several standard image datasets to compare our modifications against the VAE baseline, the closely-related Regularized Autoencoder (RAE), the Importance-Weighted Autoencoder (IWAE), as well as the Wasserstein Autoencoder (WAE)."
Researcher Affiliation | Collaboration | 1: Robert Bosch GmbH, Corporate Research, 71272 Renningen, Germany; 2: Research Center for Information Technology (FZI), 76131 Karlsruhe, Germany.
Pseudocode | No | The paper describes the mathematical formulations and components of the model but does not include a section explicitly labeled "Pseudocode" or "Algorithm", nor are there structured code-like blocks.
Open Source Code | Yes | "Code available at: https://github.com/boschresearch/unscented-autoencoder"
Open Datasets | Yes | "We conduct rigorous experiments on several standard image datasets to compare our modifications against the VAE baseline, the closely-related Regularized Autoencoder (RAE), the Importance-Weighted Autoencoder (IWAE), as well as the Wasserstein Autoencoder (WAE)." The datasets named are Fashion-MNIST (Xiao et al., 2017), CIFAR10 (Krizhevsky et al., 2009), and CelebA (Liu et al., 2015).
Dataset Splits | Yes | "For the training dataset, we use 50k out of the 60k provided examples, leaving the remaining 10k for the validation dataset. For the test dataset, we use the provided examples. In CIFAR10, we perform a random horizontal flip on the training data followed by a normalization for all dataset subsets. We use the same training/validation/test split method as in Fashion-MNIST. In CelebA, we perform a 148 × 148 center crop and resize the images to 64 × 64. We use the provided training/validation/testing subsets."
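The quoted splits and preprocessing are straightforward to reproduce with torchvision. The sketch below is an illustrative reconstruction, not the authors' code; the normalization statistics and the random-split seed are placeholders, since the quoted text does not specify them.

```python
# Sketch of the described splits/preprocessing (assumed torchvision datasets/transforms).
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Fashion-MNIST: 50k train / 10k validation out of the 60k provided training examples.
fmnist_full = datasets.FashionMNIST("data", train=True, download=True,
                                    transform=transforms.ToTensor())
fmnist_train, fmnist_val = random_split(
    fmnist_full, [50_000, 10_000], generator=torch.Generator().manual_seed(0))
fmnist_test = datasets.FashionMNIST("data", train=False, download=True,
                                    transform=transforms.ToTensor())

# CIFAR10: random horizontal flip on training data, normalization on all subsets.
cifar_train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # placeholder statistics
])
cifar_train = datasets.CIFAR10("data", train=True, download=True,
                               transform=cifar_train_tf)

# CelebA: 148 x 148 center crop, resize to 64 x 64; the provided splits are used.
celeba_tf = transforms.Compose([
    transforms.CenterCrop(148),
    transforms.Resize(64),
    transforms.ToTensor(),
])
celeba_train = datasets.CelebA("data", split="train", transform=celeba_tf,
                               download=True)
```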
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the software used.
Software Dependencies | Yes | "All models are implemented in PyTorch (Paszke et al., 2019) and use the library provided in (Seitzer, 2020) for FID computation." References: Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems, 32, 2019. Seitzer, M. pytorch-fid: FID Score for PyTorch. https://github.com/mseitzer/pytorch-fid, August 2020. Version 0.2.1.
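With the cited pytorch-fid package (Seitzer, 2020), FID between a folder of reference images and a folder of generated samples can be computed as in the following sketch; the directory names are placeholders, and the exact function signature may differ slightly between package versions.

```python
# Sketch of FID computation with the pytorch-fid package cited above.
from pytorch_fid.fid_score import calculate_fid_given_paths

fid = calculate_fid_given_paths(
    ["samples/real", "samples/generated"],  # placeholder image directories
    batch_size=50,
    device="cuda",
    dims=2048,  # standard Inception-v3 pool3 feature dimension
)
print(f"FID: {fid:.2f}")
```

The package also provides an equivalent command-line entry point, e.g. `python -m pytorch_fid samples/real samples/generated`.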
Experiment Setup | Yes | "All models are trained for 100 epochs, starting with a 0.005 learning rate that is then halved after every five epochs without improvement. The weights used in the loss functions are the following: KL-divergence (or the Wasserstein metric) terms are weighted with β = 2.5e-4 in the case of VAE and UAE and β = 1e-4 for the RAE. The decoder regularization terms are weighted with γ = 1e-6 for both RAE and UAE." Also: "Network architectures are given in Tab. 4 and largely follow the architecture in (Ghosh et al., 2019)."
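A minimal PyTorch sketch of this schedule and loss weighting is given below; the optimizer choice, the monitored validation metric, and the `model.losses` interface are assumptions for illustration, not details taken from the paper.

```python
# Sketch of the stated optimization schedule and loss weighting (assumptions noted inline).
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=0.005)  # optimizer type assumed
# Halve the learning rate after every five epochs without improvement.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5)

beta = 2.5e-4   # KL / Wasserstein weight for VAE and UAE (1e-4 for RAE)
gamma = 1e-6    # decoder regularization weight for RAE and UAE

for epoch in range(100):
    for x in train_loader:
        # Hypothetical interface returning the three loss components.
        recon_loss, reg_loss, dec_reg = model.losses(x)
        loss = recon_loss + beta * reg_loss + gamma * dec_reg
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step(validate(model))  # hypothetical validation metric
```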