A New Distribution on the Simplex with Auto-Encoding Applications
Authors: Andrew Stirn, Tony Jebara, David Knowles
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the distribution's utility in a variety of semi-supervised auto-encoding tasks. In all cases, the resulting models achieve competitive performance commensurate with their simplicity, use of explicit probability models, and abstinence from adversarial training. |
| Researcher Affiliation | Collaboration | Department of Computer Science, Columbia University, New York, NY 10027; {andrew.stirn,jebara,daknowles}@cs.columbia.edu; jointly affiliated with New York Genome Center; jointly affiliated with Spotify Technology S.A.; jointly affiliated with Columbia University's Data Science Institute and the New York Genome Center |
| Pseudocode | Yes | Algorithm 1: A Generalized Stick-Breaking Process (an illustrative sketch of stick-breaking appears below the table) |
| Open Source Code | Yes | Our source code can be found at https://github.com/astirn/MV-Kumaraswamy. |
| Open Datasets | Yes | We utilize the TensorFlow Datasets API, from which we source our data. For all experiments, we split our data into 4 subsets: unlabeled training (U) data, labeled training (L) data, validation data, and test data. For MNIST: |U| = 49,400, |L| = 600, |validation| = |test| = 10,000. For SVHN: |U| = 62,257, |L| = 1,000, |validation| = 10,000, |test| = 26,032. |
| Dataset Splits | Yes | For all experiments, we split our data into 4 subsets: unlabeled training (U) data, labeled training (L) data, validation data, and test data. For MNIST: |U| = 49,400, |L| = 600, |validation| = |test| = 10,000. For SVHN: |U| = 62,257, |L| = 1,000, |validation| = 10,000, |test| = 26,032. (A split-reconstruction sketch appears below the table.) |
| Hardware Specification | No | The paper states 'We utilized GPU acceleration and found that cards with 8 GB of memory were sufficient,' but does not specify the exact GPU models (e.g., NVIDIA A100, RTX 2080 Ti) or CPU details used for the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and 'ADAM' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Our models are implemented in TensorFlow and were trained using ADAM with a batch size B = 250 and 5 Monte-Carlo samples for each training example. We use learning rates 1×10⁻³ and 1×10⁻⁴ respectively for MNIST and SVHN. Other optimizer parameters were kept at TensorFlow defaults. (A minimal optimizer-setup sketch appears below the table.) |
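
The quoted Algorithm 1 constructs a point on the simplex by stick-breaking over Kumaraswamy draws. As a rough illustration only, not the authors' exact generalized algorithm (which is in their repository), here is a minimal NumPy sketch; the function names `sample_kumaraswamy` and `stick_breaking` and the parameter names `a`, `b` are hypothetical:

```python
import numpy as np

def sample_kumaraswamy(a, b, rng):
    # Inverse-CDF sampling: if U ~ Uniform(0, 1), then
    # X = (1 - (1 - U)^(1/b))^(1/a) follows Kumaraswamy(a, b).
    u = rng.uniform(size=np.shape(a))
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

def stick_breaking(a, b, rng=None):
    """Map K-1 Kumaraswamy draws onto the K-simplex.

    a, b: positive shape parameters of length K-1 (hypothetical names).
    Returns pi with pi >= 0 and pi.sum() == 1.
    """
    rng = rng or np.random.default_rng()
    v = sample_kumaraswamy(np.asarray(a, float), np.asarray(b, float), rng)
    pi = np.empty(v.size + 1)
    remaining = 1.0                   # length of the unbroken stick
    for k, v_k in enumerate(v):
        pi[k] = v_k * remaining       # break off a fraction v_k of the stick
        remaining *= 1.0 - v_k
    pi[-1] = remaining                # last coordinate takes what is left
    return pi

print(stick_breaking(a=[2.0, 2.0, 2.0], b=[3.0, 3.0, 3.0]))  # 4-dim simplex point
```

Because each draw is just an inverse CDF applied to uniform noise, the construction stays reparameterizable, which is one reason Kumaraswamy-based distributions are attractive inside auto-encoders.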
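
The MNIST split sizes quoted above add up to the standard TFDS partitions (600 + 49,400 + 10,000 = 60,000 training examples, plus the 10,000-example test set). A minimal sketch of how such splits could be carved out with `tensorflow_datasets`, assuming a simple take/skip scheme (the authors' actual selection of the 600 labeled examples may differ, e.g., it could be class-balanced):

```python
import tensorflow_datasets as tfds

# Size-matching reconstruction of the quoted MNIST splits:
# |L| = 600, |U| = 49,400, |validation| = |test| = 10,000.
train = tfds.load('mnist', split='train', shuffle_files=True)  # 60,000 examples
test = tfds.load('mnist', split='test')                        # 10,000 examples

labeled = train.take(600)                 # L
unlabeled = train.skip(600).take(49_400)  # U
validation = train.skip(50_000)           # remaining 10,000 examples
```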
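
And for the training configuration in the last row, a minimal TensorFlow sketch under the stated hyperparameters; the model and loss are out of scope here, so this only pins down the optimizer and batch size:

```python
import tensorflow as tf

BATCH_SIZE = 250        # B = 250 from the quoted setup
NUM_MC_SAMPLES = 5      # Monte-Carlo samples per training example
LEARNING_RATE = 1e-3    # MNIST; the paper uses 1e-4 for SVHN

# All other optimizer arguments stay at TensorFlow/Keras defaults, matching
# "Other optimizer parameters were kept at TensorFlow defaults."
optimizer = tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE)

# A tf.data pipeline would then be batched to B = 250, e.g. using the
# splits sketched above: train_batches = unlabeled.batch(BATCH_SIZE)
```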