Unscented Autoencoder
Authors: Faris Janjos, Lars Rosenbaum, Maxim Dolgov, J. Marius Zoellner
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show competitive performance in Fréchet Inception Distance (FID) scores over closely-related models, in addition to a lower training variance than the VAE. We conduct rigorous experiments on several standard image datasets to compare our modifications against the VAE baseline, the closely-related Regularized Autoencoder (RAE), the Importance-Weighted Autoencoder (IWAE), as well as the Wasserstein Autoencoder (WAE). |
| Researcher Affiliation | Collaboration | Robert Bosch GmbH, Corporate Research, 71272 Renningen, Germany; Research Center for Information Technology (FZI), 76131 Karlsruhe, Germany. |
| Pseudocode | No | The paper describes the mathematical formulations and components of the model but does not include a section explicitly labeled 'Pseudocode' or 'Algorithm', nor are there structured code-like blocks. |
| Open Source Code | Yes | Code available at: https://github.com/boschresearch/unscented-autoencoder |
| Open Datasets | Yes | We conduct rigorous experiments on several standard image datasets to compare our modifications against the VAE baseline, the closely-related Regularized Autoencoder (RAE), the Importance-Weighted Autoencoder (IWAE), as well as the Wasserstein Autoencoder (WAE). The datasets are Fashion-MNIST (Xiao et al., 2017), CIFAR10 (Krizhevsky et al., 2009), and CelebA (Liu et al., 2015). |
| Dataset Splits | Yes | For the training dataset, we use 50k out of the 60k provided examples, leaving the remaining 10k for the validation dataset. For the test dataset, we use the provided examples. In CIFAR10, we perform a random horizontal flip on the training data followed by a normalization for all dataset subsets, and use the same training/validation/test split method as in Fashion-MNIST. In CelebA, we perform a 148 × 148 center crop, resize the images to 64 × 64, and use the provided training/validation/test subsets. (See the data-loading sketch below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the software used. |
| Software Dependencies | Yes | All models are implemented in PyTorch (Paszke et al., 2019) and use the library provided in (Seitzer, 2020) for FID computation. Cited references: Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems, 32, 2019. Seitzer, M. pytorch-fid: FID Score for PyTorch. https://github.com/mseitzer/pytorch-fid, August 2020. Version 0.2.1. (See the FID-computation sketch below the table.) |
| Experiment Setup | Yes | All models are trained for 100 epochs, starting with a 0.005 learning rate that is then halved after every five epochs without improvement. The weights used in the loss functions are the following: the KL-divergence (or Wasserstein metric) terms are weighted with β = 2.5e-4 in the case of the VAE and UAE and β = 1e-4 for the RAE; the decoder regularization terms are weighted with γ = 1e-6 for both RAE and UAE. Network architectures are given in Tab. 4 and largely follow the architecture in (Ghosh et al., 2019). (See the training-schedule sketch below the table.) |
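
The dataset-split row maps onto standard torchvision datasets and transforms. The sketch below is a minimal, hedged reconstruction of that preprocessing: the use of `random_split` for the Fashion-MNIST 50k/10k split and the CIFAR10 normalization statistics are assumptions, not values quoted from the paper.

```python
# Minimal sketch of the reported splits/preprocessing (not the authors' code).
# Assumptions: torchvision loaders, random_split for the 50k/10k split, and
# placeholder normalization statistics for CIFAR10.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Fashion-MNIST: 50k train / 10k validation out of the 60k provided examples.
fmnist_full = datasets.FashionMNIST("data", train=True, download=True,
                                    transform=transforms.ToTensor())
fmnist_train, fmnist_val = random_split(
    fmnist_full, [50_000, 10_000], generator=torch.Generator().manual_seed(0))
fmnist_test = datasets.FashionMNIST("data", train=False, download=True,
                                    transform=transforms.ToTensor())

# CIFAR10: random horizontal flip on training data, normalization for all subsets.
cifar_train_tf = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # statistics assumed
])
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=cifar_train_tf)

# CelebA: 148x148 center crop, resize to 64x64, provided train/val/test splits.
celeba_tf = transforms.Compose([
    transforms.CenterCrop(148), transforms.Resize(64), transforms.ToTensor()])
celeba_train = datasets.CelebA("data", split="train", download=True, transform=celeba_tf)
```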
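
For FID, the dependencies row points to the pytorch-fid package (Seitzer, 2020, version 0.2.1). Below is a hedged usage sketch assuming real and generated images have been written to two image folders; the folder names are placeholders, and minor details of the call signature may differ between package versions.

```python
# Hedged usage sketch for pytorch-fid (Seitzer, 2020); folder names are placeholders.
# Equivalent CLI: python -m pytorch_fid real_images/ generated_images/
from pytorch_fid.fid_score import calculate_fid_given_paths

fid = calculate_fid_given_paths(
    ["real_images/", "generated_images/"],  # two directories of image files
    batch_size=50,
    device="cuda",
    dims=2048,  # pool3 features of the InceptionV3 backbone used for FID
)
print(f"FID: {fid:.2f}")
```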
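
The experiment-setup row translates into an optimizer with a plateau-based learning-rate schedule (halve after five epochs without improvement). The sketch below uses the reported hyperparameters but toy stand-ins for the model, data, and loss terms; the optimizer type and the plateau metric are assumptions, not stated in the quoted text.

```python
# Hedged reconstruction of the reported training schedule; model, data, and loss
# terms are toy stand-ins, only lr/epochs/weights come from the reported setup.
import torch
from torch import nn

enc, dec = nn.Linear(784, 16), nn.Linear(16, 784)          # placeholder encoder/decoder
train_loader = [torch.randn(32, 784) for _ in range(10)]    # placeholder data
val_loader = [torch.randn(32, 784) for _ in range(2)]

beta, gamma = 2.5e-4, 1e-6   # reported UAE weights (beta = 1e-4 for the RAE)
optimizer = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=0.005)
# "halved after every five epochs without improvement" -> ReduceLROnPlateau
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=5)

for epoch in range(100):
    for x in train_loader:
        z = enc(x)
        recon = dec(z)
        kl_like = z.pow(2).mean()                                # stand-in for the KL/Wasserstein term
        dec_reg = sum(p.pow(2).sum() for p in dec.parameters())  # stand-in decoder regularizer
        loss = (recon - x).pow(2).mean() + beta * kl_like + gamma * dec_reg
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        val_loss = sum((dec(enc(x)) - x).pow(2).mean() for x in val_loader) / len(val_loader)
    scheduler.step(val_loss)
```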