Generalization Gap in Amortized Inference
Authors: Mingtian Zhang, Peter Hayes, David Barber
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the reverse half-asleep to improve the generalization of VAEs on three different datasets: binary MNIST, grey MNIST [24] and CIFAR10 [23]. For binary and grey MNIST, we use latent dimension 16/32 and neural nets with 2 layers of 500 hidden units in both the encoder and decoder. We train the VAE with the usual amortized inference approach using Adam with lr = 3 × 10⁻⁴ for 1000 epochs and save the model every 100 epochs. We then use the saved models to 1) evaluate on the test data sets, 2) conduct optimal inference by training q_φ(z|x) on the test data and 3) run the reverse half-asleep method before calculating the test BPD. (An evaluation-protocol sketch follows the table.) |
| Researcher Affiliation | Academia | Mingtian Zhang Peter Hayes David Barber Centre for Artificial Intelligence, University College London {m.zhang,p.hayes,d.barber}@cs.ucl.ac.uk |
| Pseudocode | Yes | Algorithm 1: Bits Back with Amortized Inference. Comp./decomp. stages share {p_θ(x|z), q_φ(z|x), p(z)}. Algorithm 2: Bits Back with K-step Optimal Inference. Comp./decomp. stages share {p_θ(x|z), q_φ(z|x), p(z)} and the optimization procedure of Equation 25. (A K-step refinement sketch follows the table.) |
| Open Source Code | Yes | Implementation can be found in the following repo: https://github.com/zmtomorrow/generalization_gap_in_amortized_inference. |
| Open Datasets | Yes | We apply the reverse half-asleep to improve the generalization of VAEs on three different datasets: binary MNIST, grey MNIST [24] and CIFAR10 [23]. |
| Dataset Splits | No | The paper refers to training and test sets (e.g., 'Xtrain', 'Xtest') and to training details in Sections 4 and 5, but it does not state exact percentages or sample counts for training, validation, and test splits; only a train/test split is implied. |
| Hardware Specification | Yes | All experiments are run on a NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions the Adam optimizer and an ANS coder, and a deep learning framework such as PyTorch is implied, but it does not give version numbers for any software dependency (Python, PyTorch, or other libraries) needed to replicate the experiments. |
| Experiment Setup | Yes | For binary and grey MNIST, we use latent dimension 16/32 and neural nets with 2 layers of 500 hidden units in both the encoder and decoder. We train the VAE with the usual amortized inference approach using Adam with lr = 3 × 10⁻⁴ for 1000 epochs and save the model every 100 epochs. ... For the reverse half-asleep, we train the amortized posterior for 100 epochs with Adam and lr = 5 × 10⁻⁴. (A model/optimizer sketch follows the table.) |
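
The hyperparameters quoted in the Experiment Setup row map directly onto a model and optimizer definition. Below is a minimal PyTorch sketch for the binary-MNIST case, assuming a Bernoulli decoder and a diagonal-Gaussian encoder; the `VAE` class name, the ReLU activations, and the `elbo` helper are illustrative assumptions, while the layer widths (2 × 500 hidden units), latent dimension (16 for binary MNIST, 32 for grey MNIST), and Adam learning rate (3 × 10⁻⁴) follow the quoted values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Encoder/decoder MLPs with 2 hidden layers of 500 units, per the quoted setup."""
    def __init__(self, x_dim=784, z_dim=16, h_dim=500):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(x_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, 2 * z_dim),           # mean and log-variance of q(z|x)
        )
        self.decoder = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, h_dim), nn.ReLU(),
            nn.Linear(h_dim, x_dim),               # Bernoulli logits for binary MNIST
        )

    def elbo(self, x):
        """Per-example ELBO: E_q[log p(x|z)] - KL(q(z|x) || N(0, I))."""
        mu, log_var = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()   # reparameterised sample
        log_px_z = -F.binary_cross_entropy_with_logits(
            self.decoder(z), x, reduction="none").sum(-1)
        kl = 0.5 * (mu ** 2 + log_var.exp() - 1.0 - log_var).sum(-1)
        return log_px_z - kl

model = VAE(z_dim=16)                                        # z_dim=32 for grey MNIST
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)    # trained for 1000 epochs in the paper
```

Grey MNIST and CIFAR10 would need a likelihood over pixel intensities rather than Bernoulli logits; the quoted text does not specify that choice, so it is left out of the sketch.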
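
The Research Type row describes three ways of computing test BPD: 1) with the amortized encoder as trained, 2) after "optimal inference", i.e. re-training q_φ(z|x) on the test data with the decoder frozen, and 3) after the reverse half-asleep step. The following hedged sketch covers 1) and 2), reusing the `VAE` class above; `bpd()`, the `test_loader` (assumed to yield batches of flattened, binarised images), and the optimizer settings for the encoder re-training are illustrative assumptions, and the reverse half-asleep objective is left as a placeholder since it is defined in the paper, not here.

```python
import copy
import math
import torch

def bpd(model, loader, dims=784):
    """ELBO-based bits per dimension, averaged over a data loader."""
    total_nats, n = 0.0, 0
    with torch.no_grad():
        for x in loader:                       # assumed: batches of flattened binary images
            total_nats += (-model.elbo(x)).sum().item()
            n += x.shape[0]
    return total_nats / (n * dims * math.log(2))

def optimal_inference(model, test_loader, epochs=100, lr=5e-4):
    """'Optimal inference': re-train only the encoder on the test data, decoder frozen."""
    refined = copy.deepcopy(model)
    opt = torch.optim.Adam(refined.encoder.parameters(), lr=lr)
    for _ in range(epochs):
        for x in test_loader:
            loss = -refined.elbo(x).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return refined

amortized_bpd = bpd(model, test_loader)                                  # 1) amortized inference
optimal_bpd = bpd(optimal_inference(model, test_loader), test_loader)    # 2) optimal inference
# 3) reverse half-asleep: re-train q_phi as specified in the paper, then call bpd() again.
```

The difference between `amortized_bpd` and `optimal_bpd` is a measure of the amortized-inference generalization gap the paper studies.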
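
Algorithm 2 in the Pseudocode row couples bits-back coding with "K-step optimal inference": the compression and decompression stages must share not only {p_θ(x|z), q_φ(z|x), p(z)} but also the optimization procedure of Equation 25, so that both sides reconstruct the identical refined posterior for each datapoint. Below is one plausible shape for that shared refinement, assuming Equation 25 is a per-datapoint ELBO maximised over a local Gaussian posterior initialised from the amortized encoder; the function name, step count, step size, and fixed RNG seed are illustrative assumptions, and the ANS coder itself is omitted.

```python
import torch
import torch.nn.functional as F

def k_step_posterior(model, x, K=10, lr=1e-2, seed=0):
    """Refine a per-datapoint Gaussian posterior with K gradient steps on the ELBO.

    Compression and decompression must run this exact procedure (same K, lr, and
    seed) so that both stages agree on the refined q(z|x) used for bits back.
    """
    torch.manual_seed(seed)                               # shared seed keeps both stages in sync
    with torch.no_grad():
        mu, log_var = model.encoder(x).chunk(2, dim=-1)   # initialise from the amortized encoder
    mu = mu.clone().requires_grad_(True)
    log_var = log_var.clone().requires_grad_(True)
    opt = torch.optim.SGD([mu, log_var], lr=lr)
    for _ in range(K):
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
        log_px_z = -F.binary_cross_entropy_with_logits(
            model.decoder(z), x, reduction="none").sum(-1)
        kl = 0.5 * (mu ** 2 + log_var.exp() - 1.0 - log_var).sum(-1)
        loss = -(log_px_z - kl).mean()                    # negative per-datapoint ELBO
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu.detach(), log_var.detach()                  # parameters of the refined q(z|x)
```

The refined mean and log-variance would then parameterise the discretised q(z|x) handed to the ANS coder on both the compression and decompression sides.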