Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Improving Variational Autoencoder Estimation from Incomplete Data with Mixture Variational Families
Authors: Vaidotas Simkus, Michael U. Gutmann
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed methods for VAE estimation on synthetic and realistic data sets with missing data (section 6). |
| Researcher Affiliation | Academia | Vaidotas Simkus EMAIL Michael U. Gutmann EMAIL School of Informatics University of Edinburgh |
| Pseudocode | Yes | Algorithm 1 Shared computation of the DeMissVAE learning objectives |
| Open Source Code | Yes | The methods are summarised in table 1 and the code implementation is available at https://github.com/vsimkus/demiss-vae. |
| Open Datasets | Yes | We here evaluate the proposed methods on real-world data sets from the UCI repository (Dua & Graff, 2017; Papamakarios et al., 2017). |
| Dataset Splits | No | The paper mentions evaluating on a 'complete test data set' and a '20K sample data set used to fit the VAEs' but does not provide specific split percentages or counts (e.g., train/validation/test splits). The missingness percentages (e.g., 20/50/80%) refer to data incompleteness, not dataset splits for training and evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU models, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'AMSGrad optimiser (Reddi et al., 2018)' and 'STL gradients (Roeder et al., 2017)' which are algorithms/techniques, but it does not specify software libraries with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | We then fitted a VAE model with 2-dimensional latent space using diagonal Gaussian encoder and decoder distributions, and a fixed standard Normal prior. For the decoder and encoder networks we used fully-connected residual neural networks with 3 residual blocks, 200 hidden dimensions, and ReLU activations. To optimise the model parameters we have used AMSGrad optimiser (Reddi et al., 2018) with a learning rate of 10⁻³ for a total of 500 epochs. |
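
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration record, which may help when comparing them against the released code. The following is only a sketch: the class name `VAEExperimentConfig` and all field names are hypothetical and are not taken from the paper or its repository, and the learning rate is taken to be 10⁻³. It uses only the Python standard library and does not build or train a model.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VAEExperimentConfig:
    """Settings quoted in the Experiment Setup row (field names are hypothetical)."""

    latent_dim: int = 2                        # 2-dimensional latent space
    encoder_family: str = "diagonal_gaussian"  # encoder distribution
    decoder_family: str = "diagonal_gaussian"  # decoder distribution
    prior: str = "standard_normal"             # fixed, not learned
    num_residual_blocks: int = 3               # per encoder/decoder network
    hidden_dim: int = 200                      # hidden dimensions
    activation: str = "relu"
    optimizer: str = "amsgrad"                 # Reddi et al., 2018
    learning_rate: float = 1e-3                # assumed to be 10^-3
    epochs: int = 500


config = VAEExperimentConfig()

# Print one setting per line for a quick side-by-side check.
for name, value in asdict(config).items():
    print(f"{name}: {value}")
```

Freezing the dataclass keeps the record immutable, so a quoted setting cannot be changed by accident once the object is created.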