Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models
Authors: Phil Chen, Masha Itkina, Ransalu Senanayake, Mykel J. Kochenderfer
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures. Our method outperforms existing dense and sparse normalization techniques in distributional accuracy. |
| Researcher Affiliation | Academia | Phil Chen, Masha Itkina, Ransalu Senanayake, Mykel J. Kochenderfer; Stanford University; {philhc, mitkina, ransalu, mykel}@stanford.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code for the experiments is available at https://github.com/sisl/EvSoftmax. |
| Open Datasets | Yes | We evaluate our method on a variety of generative models... on MNIST. ... on tiny ImageNet [29] for image generation. ... on the English-German translation dataset from IWSLT 2014 [34]. These models were pretrained on the EN-DE corpus from WMT 2016 [35]. |
| Dataset Splits | No | The paper mentions training on MNIST, tiny ImageNet, IWSLT 2014, and WMT 2016, which have standard splits, but it does not explicitly state the training, validation, or test split percentages or sample counts used for the experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or processor types) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'OpenNMT Neural Machine Translation Toolkit [32]' but does not provide version numbers for it or for any other software dependencies, libraries, or programming languages used. |
| Experiment Setup | Yes | Each model was trained for 20 epochs to minimize the ELBO of Eq. (2) (Section 4.1). Each model was trained for 200 epochs (Section 4.2). We finetune for 40000 iterations; hyperparameters and further experimental details are given in Appendix E (Section 4.4). |
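The Experiment Setup row refers to minimizing "the ELBO of Eq. (2)", but the equation itself is not reproduced in this report. For reference, the standard single-sample VAE evidence lower bound is shown below; whether the paper's Eq. (2) takes exactly this form, or its negative as a training loss, should be checked against the paper.

```latex
% Standard single-sample VAE evidence lower bound (ELBO).
% Assumption: the paper's Eq. (2) corresponds to this quantity (or its negative,
% used as the minimized loss); this is not quoted from the paper itself.
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\!\left( q_\phi(z \mid x) \,\middle\|\, p(z) \right)
```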
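The paper's contribution is a sparse alternative to the dense softmax (the "dense and sparse normalization techniques" quoted in the table above). As a rough illustration of what such a normalization does, the sketch below zeroes out logits that fall below the mean of the logit vector and renormalizes the rest. The function name `mean_threshold_softmax` and the thresholding rule are assumptions for illustration only; they are not taken from the paper's definition of ev-softmax, which should be consulted in the paper or the linked repository.

```python
import numpy as np

def mean_threshold_softmax(logits):
    """Hypothetical sketch of a sparse normalization: entries whose logit falls
    below the mean of the logit vector get exactly zero probability, and a
    standard softmax is taken over the surviving entries. Illustrative only;
    not guaranteed to match the paper's ev-softmax definition."""
    logits = np.asarray(logits, dtype=np.float64)
    keep = logits >= logits.mean()                 # support: entries at or above the mean
    shifted = logits - logits[keep].max()          # subtract the max for numerical stability
    exp = np.where(keep, np.exp(shifted), 0.0)     # zero out the pruned entries
    return exp / exp.sum()

# Example: weak entries receive exactly zero mass, unlike the dense softmax,
# which would give them small positive probabilities.
probs = mean_threshold_softmax([4.0, 1.0, 0.5, 3.5])
print(probs)  # approximately [0.62, 0.0, 0.0, 0.38]
```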