Why do Variational Autoencoders Really Promote Disentanglement?

Authors: Pratik Bhowal, Achint Soni, Sirisha Rambhatla

ICML 2024

Reproducibility
Variable | Result | LLM Response
Research Type | Experimental | Complementary to our theoretical contributions, our experimental results corroborate our analysis. Code is available at https://github.com/criticalml-uw/Disentanglement-in-VAE. In this section, we discuss the experimental setup and the results to verify our theoretical findings. We experimentally verify how introducing local non-linearity makes the VAE modeling more realistic.
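The models assessed here are β-VAE-style architectures (Table 3 reports β = 6). As context for the experimental rows below, a minimal NumPy sketch of the standard β-VAE objective is given here; the function and argument names are illustrative and not taken from the paper's code.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=6.0):
    """Standard beta-VAE objective: reconstruction + beta * KL(q(z|x) || N(0, I)).

    beta = 6 matches the value reported in Table 3; everything else here
    is an illustrative sketch, not the paper's implementation.
    """
    # Per-sample squared-error reconstruction term, summed over features.
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    # Closed-form KL divergence between N(mu, diag(exp(log_var))) and N(0, I).
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)
    return np.mean(recon + beta * kl)
```

With mu = 0 and log_var = 0 the KL term vanishes, so perfect reconstruction yields a loss of exactly 0; raising β above 1 penalizes posteriors that deviate from the isotropic prior more heavily, which is the mechanism usually credited with promoting disentanglement.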
Researcher Affiliation | Collaboration | ¹NVIDIA, India; ²Department of Computer Science, University of Waterloo, Ontario, Canada; ³Department of Management Science and Engineering, University of Waterloo, Ontario, Canada.
Pseudocode | No | The paper contains mathematical proofs and lemmas, but no clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/criticalml-uw/Disentanglement-in-VAE.
Open Datasets | Yes | We study the VAE architectures using four widely used datasets, namely, dSprites, 3D Faces (Paysan et al., 2009), 3D Shapes (Burgess & Kim, 2018), and the MPI3D complex real-world shapes dataset (Gondal et al., 2019).
Dataset Splits | No | A validation set x^(i) ∈ X_val is defined. For each x^(i), g_D and M_D are estimated as neural networks, considering that these are local approximations unique to each x^(i). Eq. (11) is employed to train these networks, as detailed in Sect. 4.4.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | Table 3 mentions 'Adam' as an optimizer with a learning rate, but no specific software or library versions (e.g., Python, PyTorch, TensorFlow, scikit-learn) are provided to ensure reproducibility.
Experiment Setup | Yes | Table 3. The architectures of the VAE-based models used for the different datasets.
  Encoder: Conv [32, 4, 2, 1] (ReLU), [32, 4, 2, 1] (ReLU), [64, 4, 2, 1] (ReLU), [64, 4, 2, 1] (BN) (ReLU), [256, 4, 1, 0] (BN) (ReLU), [Latent Space, 1, 1] (ReLU)
  Decoder: Conv [64, 1, 1, 0] (ReLU), ConvTrans [64, 4, 1, 0] (ReLU), [64, 4, 2, 1] (ReLU), [64, 4, 2, 1] (ReLU), [32, 4, 2, 1] (ReLU), [32, 4, 2, 1] (ReLU), [3, 4, 2, 1]
  β: 6
  Optimizer: Adam (lr = 10⁻³, betas = (0.9, 0.999))
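The Table 3 encoder spec can be sanity-checked with the standard convolution output-size formula, floor((size + 2·padding − kernel) / stride) + 1. Assuming a 64×64 input resolution (typical for dSprites and 3D Shapes, though not stated in the quoted table), the listed (kernel, stride, padding) tuples collapse the spatial map to 1×1 at the final latent-space convolution:

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a conv layer: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel, stride, padding) for each Table 3 encoder layer; the final
# 1x1 conv maps into the latent space. The 64x64 input is an assumption.
encoder_layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0), (1, 1, 0)]

sizes = [64]
for k, s, p in encoder_layers:
    sizes.append(conv_out(sizes[-1], k, s, p))
print(sizes)  # [64, 32, 16, 8, 4, 1, 1]
```

Each stride-2 layer halves the resolution (64 → 32 → 16 → 8 → 4), the 4×4 valid conv reduces it to 1×1, and the final 1×1 conv preserves it, which is consistent with the table's flattening into a latent vector.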