Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

Authors: Gonçalo Correia, Vlad Niculae, Wilker Aziz, André Martins

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We report successful results in three tasks covering a range of latent variable modeling applications: a semisupervised deep generative model, a latent communication game, and a generative model with a bit-vector latent representation." and, from Section 5 (Experimental Analysis), "We next demonstrate the applicability of our proposed strategies by tackling three tasks: a deep generative model with semisupervision (§5.1), an emergent communication two-player game over a discrete channel (§5.2), and a variational autoencoder with latent binary factors (§5.3)."
Researcher Affiliation | Collaboration | Gonçalo M. Correia (Instituto de Telecomunicações, Lisbon, Portugal) goncalo.correia@lx.it.pt; Vlad Niculae (IvI, University of Amsterdam, The Netherlands) vlad@vene.ro; Wilker Aziz (ILLC, University of Amsterdam, The Netherlands) w.aziz@uva.nl; André F. T. Martins (Instituto de Telecomunicações, Lisbon, Portugal; LUMLIS (Lisbon ELLIS Unit), Instituto Superior Técnico, Lisbon, Portugal; Unbabel, Lisbon, Portugal) andre.t.martins@tecnico.ulisboa.pt
Pseudocode | No | The paper describes algorithms and procedures in text (e.g., the active set algorithm for SparseMAP), and Appendix B details "The Active Set Algorithm for SparseMAP", but it does not present structured pseudocode or a formally labeled algorithm block.
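For orientation, the sparse mapping that underlies the paper's marginalization strategy can be stated compactly in code. The sketch below is an illustrative PyTorch reimplementation of the sparsemax projection (Martins & Astudillo, 2016), not the authors' released implementation; the function name and the single-vector interface are assumptions made for clarity.

```python
import torch

def sparsemax(z: torch.Tensor) -> torch.Tensor:
    # Euclidean projection of a score vector onto the probability simplex;
    # unlike softmax, it can assign exactly zero probability to low-scoring entries.
    z_sorted, _ = torch.sort(z, descending=True)
    cumsum = torch.cumsum(z_sorted, dim=0)
    ks = torch.arange(1, z.numel() + 1, dtype=z.dtype, device=z.device)
    # Size of the nonzero support: largest k with 1 + k * z_(k) > sum_{j<=k} z_(j).
    n_support = ((1 + ks * z_sorted) > cumsum).sum()
    tau = (cumsum[n_support - 1] - 1) / n_support  # threshold subtracted from all scores
    return torch.clamp(z - tau, min=0.0)
```

Because the output is exactly zero outside a small support, marginalizing over the latent variable only requires evaluating the model on the few outcomes with nonzero probability, which is the efficiency argument the paper makes.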
Open Source Code | Yes | Code is publicly available at https://github.com/deep-spin/sparse-marginalization-lvm
Open Datasets | Yes | "Data and architecture. We evaluate this model on the MNIST dataset [31], using 10% of labeled data, treating the remaining data as unlabeled." and "We use Fashion-MNIST [42], consisting of 256-level grayscale images x ∈ {0, 1, …, 255}^{28×28}."
Dataset Splits | No | The paper mentions using 10% labeled data for the semisupervised VAE task and discusses training epochs, but it does not provide specific train/validation/test dataset splits (percentages or counts) for reproducibility.
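As a concrete reading of the data setup quoted above, the sketch below loads MNIST with torchvision and carves out a 10% labeled pool, treating the remainder as unlabeled. The splitting mechanism, seed, transform, and paths are illustrative assumptions, not details taken from the paper.

```python
import torch
from torchvision import datasets, transforms

# Load the MNIST training set (root path is arbitrary).
train_set = datasets.MNIST(root="data", train=True, download=True,
                           transform=transforms.ToTensor())

# Keep 10% of the training images as the labeled pool, as described in the paper;
# the random split and fixed seed here are assumptions for illustration only.
n_labeled = len(train_set) // 10  # 6,000 of 60,000 images
labeled_set, unlabeled_set = torch.utils.data.random_split(
    train_set,
    [n_labeled, len(train_set) - n_labeled],
    generator=torch.Generator().manual_seed(0),
)
```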
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions using PyTorch [62] for implementation, but it does not specify version numbers for PyTorch or any other software dependencies needed to reproduce the experiments.
Experiment Setup | Yes | "We describe any further architecture and hyperparameter details in App. E." The experimental sections give details such as "For top-k sparsemax, we choose k = 10." and "we used b = D/2". Appendix E adds "Each model was trained for 200 epochs." and "All methods are trained for 500 epochs."
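To make the k = 10 setting concrete, here is a hedged sketch of top-k sparsemax: only the k highest-scoring outcomes are projected with sparsemax and every other outcome receives exactly zero probability. This is an illustrative reimplementation of the idea, not the authors' code; the function and variable names are assumptions.

```python
import torch

def topk_sparsemax(z: torch.Tensor, k: int = 10) -> torch.Tensor:
    # Restrict attention to the k highest scores; all other outcomes get probability zero.
    k = min(k, z.numel())
    topk_vals, topk_idx = torch.topk(z, k)  # values come back sorted in descending order
    cumsum = torch.cumsum(topk_vals, dim=0)
    ks = torch.arange(1, k + 1, dtype=z.dtype, device=z.device)
    # Support size of the sparsemax solution restricted to the retained scores.
    n_support = ((1 + ks * topk_vals) > cumsum).sum()
    tau = (cumsum[n_support - 1] - 1) / n_support
    p = torch.zeros_like(z)
    p[topk_idx] = torch.clamp(topk_vals - tau, min=0.0)
    return p

# Example: a sparse distribution over 100 outcomes with at most 10 nonzero entries.
probs = topk_sparsemax(torch.randn(100), k=10)
assert torch.isclose(probs.sum(), torch.tensor(1.0))
```

With at most k nonzero probabilities, the expectation over the latent variable can be computed exactly on the retained outcomes, which is how the paper keeps marginalization tractable for large output spaces.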