Variational Autoencoder with Implicit Optimal Priors
Authors: Hiroshi Takahashi, Tomoharu Iwata, Yuki Yamanaka, Masanori Yamada, Satoshi Yagi
AAAI 2019, pp. 5066-5073 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various datasets show that the VAE with this implicit optimal prior achieves high density estimation performance. |
| Researcher Affiliation | Industry | Hiroshi Takahashi¹, Tomoharu Iwata², Yuki Yamanaka³, Masanori Yamada³, Satoshi Yagi¹; ¹NTT Software Innovation Center, ²NTT Communication Science Laboratories, ³NTT Secure Platform Laboratories. {takahashi.hiroshi, iwata.tomoharu, yamanaka.yuki, yamada.m, yagi.satoshi}@lab.ntt.co.jp |
| Pseudocode | Yes | Algorithm 1 shows the pseudo code of the optimization procedure of this model, where K is the minibatch size of SGD. |
| Open Source Code | No | The paper does not provide concrete access to its source code, nor does it explicitly state that its code is being released. |
| Open Datasets | Yes | We used five datasets: One Hot (Mescheder, Nowozin, and Geiger 2017), MNIST (Salakhutdinov and Murray 2008), OMNIGLOT (Burda, Grosse, and Salakhutdinov 2015), Frey Faces, and Histopathology (Tomczak and Welling 2016). |
| Dataset Splits | Yes | Table 1 (number and dimensions of datasets) lists, for each dataset, its dimension and the train/validation/test sizes, e.g. MNIST: dimension 784, train 50,000, valid 10,000, test 10,000. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Adam (Kingma and Ba 2014)' as an optimizer, but does not provide specific ancillary software details with version numbers for libraries or full software environments. |
| Experiment Setup | Yes | We used two-layer neural networks (500 hidden units per layer) for the encoder, the decoder, and the density ratio estimator. We trained all methods using Adam (Kingma and Ba 2014) with a mini-batch size of 100 and a learning rate in {10^-4, 10^-3}. We set the maximum number of epochs to 1,000 and used early stopping (Goodfellow, Bengio, and Courville 2016) on the basis of validation data. We set the sample size of the reparameterization trick to L = 1. In addition, we used warm-up (Bowman et al. 2015) for the first 100 epochs of Adam. With our approach, we used dropout (Srivastava et al. 2014) in the training of the density ratio estimator since it is likely to over-fit; we set the keep probability of dropout to 50%. We updated the density ratio estimator parameters ψ for 10 epochs for each single epoch of updates to the VAE parameters θ and φ. We set the sampling size of the Monte Carlo approximation in Eq. (20) to M = N. |
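
Since the paper releases no code, a minimal re-implementation sketch of the quoted setup may be helpful. The snippet below assumes PyTorch; the two-layer architecture with 500 hidden units, the 50% dropout on the density ratio estimator, the mini-batch size of 100, and the learning-rate range {10^-4, 10^-3} come from the quoted setup, while the latent dimensionality, the tanh activations, and all class and variable names are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of the experiment setup; not the authors' code.
import torch
import torch.nn as nn

HIDDEN = 500    # 500 hidden units per layer (from the paper)
LATENT = 40     # latent dimensionality: an assumption, not stated in the quote
X_DIM = 784     # e.g. MNIST
BATCH_SIZE = 100

class Encoder(nn.Module):
    """Two-layer MLP producing the mean and log-variance of q(z|x)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(X_DIM, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, HIDDEN), nn.Tanh(),
        )
        self.mu = nn.Linear(HIDDEN, LATENT)
        self.logvar = nn.Linear(HIDDEN, LATENT)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """Two-layer MLP mapping latent codes to Bernoulli logits over pixels."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(LATENT, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, HIDDEN), nn.Tanh(),
            nn.Linear(HIDDEN, X_DIM),
        )

    def forward(self, z):
        return self.body(z)

class DensityRatioEstimator(nn.Module):
    """Two-layer MLP with 50% dropout, as described for the density ratio estimator."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(LATENT, HIDDEN), nn.Tanh(), nn.Dropout(p=0.5),
            nn.Linear(HIDDEN, HIDDEN), nn.Tanh(), nn.Dropout(p=0.5),
            nn.Linear(HIDDEN, 1),
        )

    def forward(self, z):
        return self.body(z)

encoder, decoder, ratio = Encoder(), Decoder(), DensityRatioEstimator()

# Adam with a learning rate chosen from {1e-4, 1e-3}; the paper also uses
# KL warm-up for the first 100 epochs, early stopping on validation data,
# and 10 epochs of density-ratio-estimator updates per epoch of VAE updates
# (training loop not shown here).
vae_opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
ratio_opt = torch.optim.Adam(ratio.parameters(), lr=1e-3)
```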