Hybrid Models with Deep and Invertible Features
Authors: Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now report experimental findings for a range of regression and classification tasks. Unless otherwise stated, we used the Glow architecture (Kingma & Dhariwal, 2018) to define the DIGLM's invertible transform and factorized standard Gaussian distributions as the latent prior p(z). (A minimal model sketch follows the table.) |
| Researcher Affiliation | Industry | Eric Nalisnick *1, Akihiro Matsukawa *1, Yee Whye Teh 1, Dilan Gorur 1, Balaji Lakshminarayanan 1; 1 DeepMind. Correspondence to: Balaji Lakshminarayanan <balajiln@google.com>. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., repository link, explicit statement of code release) for its source code. |
| Open Datasets | Yes | Next we evaluate the model on a large-scale regression task using the flight delay data set (Hensman et al., 2013). ... Moving on to classification, we train a DIGLM on MNIST ... We move on to natural images, performing a similar evaluation on SVHN. ... We use CIFAR-10 for the OOD set. |
| Dataset Splits | No | While the paper mentions tuning hyperparameters on a 'validation set' for MNIST and semi-supervised learning, it does not provide specific details on the size, percentage, or method used for creating these validation splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components and methods like 'Glow architecture', 'Adam optimizer', 'batch normalization', and 'dropout', but it does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, or library versions). |
| Experiment Setup | Yes | We train using Adam optimizer for 10 epochs with learning rate 10^-3 and batch size 100. ... Optimization was done via Adam (Kingma & Ba, 2014) with a 10^-4 initial learning rate for 1000k steps, then decayed by half at iterations 800k and 900k. ... We use a larger network of 24 Glow blocks and employ multi-scale factoring (Dinh et al., 2017) every 8 blocks. We use a larger Highway network containing 300 hidden units. (A training-loop sketch with these settings follows the table.) |
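To make the DIGLM construction quoted in the Research Type row concrete, here is a minimal PyTorch sketch, not the authors' implementation: a single affine coupling layer stands in for the paper's stack of Glow blocks, and the names `AffineCoupling`, `DIGLM`, `hybrid_loss`, and `lam` are illustrative assumptions. It shows the core idea of the hybrid model: an invertible transform z = f(x) gives an exact log p(x) via the change-of-variables formula with a factorized standard Gaussian prior, while a generalized linear model on z handles prediction.

```python
# Minimal DIGLM-style hybrid model sketch (illustrative stand-in, not the
# authors' Glow-based implementation).
import torch
import torch.nn as nn
import torch.distributions as D


class AffineCoupling(nn.Module):
    """Invertible affine coupling layer (simplified Real NVP / Glow-style block)."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        self.dim = dim
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden), nn.ReLU(),
            nn.Linear(hidden, (dim - dim // 2) * 2),
        )

    def forward(self, x):
        # Split input; transform the second half conditioned on the first.
        x1, x2 = x[:, : self.dim // 2], x[:, self.dim // 2:]
        log_s, t = self.net(x1).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)            # keep scales well-behaved
        z2 = x2 * torch.exp(log_s) + t
        z = torch.cat([x1, z2], dim=-1)
        log_det = log_s.sum(dim=-1)          # log |det Jacobian| of the transform
        return z, log_det


class DIGLM(nn.Module):
    """Hybrid model: log p(x, y) = log p(y | z) + log p(z) + log |det df/dx|, z = f(x)."""

    def __init__(self, dim, num_classes):
        super().__init__()
        self.flow = AffineCoupling(dim)
        self.head = nn.Linear(dim, num_classes)   # the "GLM" on top of z
        # Factorized standard Gaussian prior on the latent z, as in the quote.
        self.prior = D.Independent(D.Normal(torch.zeros(dim), torch.ones(dim)), 1)

    def forward(self, x):
        z, log_det = self.flow(x)
        log_px = self.prior.log_prob(z) + log_det  # change of variables
        logits = self.head(z)
        return logits, log_px


def hybrid_loss(model, x, y, lam=1.0):
    """Weighted hybrid objective: minimize -log p(y|x) - lam * log p(x)."""
    logits, log_px = model(x)
    nll_y = nn.functional.cross_entropy(logits, y)
    return nll_y - lam * log_px.mean()
```

The `lam`-weighted sum mirrors the paper's weighted hybrid objective, which trades off the discriminative term against the generative term; the particular coupling layer above is only a placeholder for the Glow blocks the paper uses.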
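The Experiment Setup row quotes Adam with a 10^-3 learning rate, 10 epochs, and batch size 100 for the smaller tasks. The loop below wires those numbers into a training sketch that reuses the `DIGLM` and `hybrid_loss` definitions above; the random tensors are placeholder data rather than any of the paper's datasets, and the larger-image runs would instead use the 10^-4 schedule and Glow architecture described in the quote.

```python
# Training-loop sketch with the quoted settings: Adam, lr 1e-3, 10 epochs,
# batch size 100. Data and model dimensions are placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset

dim, num_classes = 8, 2
x = torch.randn(1000, dim)                       # placeholder inputs
y = torch.randint(0, num_classes, (1000,))       # placeholder labels
loader = DataLoader(TensorDataset(x, y), batch_size=100, shuffle=True)

model = DIGLM(dim, num_classes)                  # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    for xb, yb in loader:
        loss = hybrid_loss(model, xb, yb, lam=1.0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```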