On Variational Bounds of Mutual Information
Authors: Ben Poole, Sherjil Ozair, Aaron van den Oord, Alex Alemi, George Tucker
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On high-dimensional, controlled problems, we empirically characterize the bias and variance of the bounds and their gradients and demonstrate the effectiveness of our new bounds for estimation and representation learning. |
| Researcher Affiliation | Collaboration | ¹Google Brain ²MILA ³DeepMind. Correspondence to: Ben Poole <pooleb@google.com>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes algorithms textually but not in a formal pseudocode format. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described, nor does it include a specific repository link or an explicit code release statement. |
| Open Datasets | Yes | Finally, we highlight the utility of these bounds for disentangled representation learning on the dSprites dataset (Matthey et al., 2017). [...] dSprites (Matthey et al., 2017). |
| Dataset Splits | No | The paper discusses 'batch size' and 'minibatches' for training but does not provide the dataset split information needed for reproduction, such as train/validation/test percentages or sample counts, or citations to predefined splits. |
| Hardware Specification | No | The paper does not specify the hardware used to run its experiments (e.g., GPU/CPU models, processor speeds, or memory amounts). |
| Software Dependencies | No | The paper does not list the software dependencies, with version numbers, needed to replicate the experiments (e.g., Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | The single-sample unnormalized critic estimates of MI exhibit high variance, and are challenging to tune for even these problems. [...] While I_NCE is a poor estimator of MI with the small training batch size of 64, the interpolated bounds are able to provide less biased estimates than I_NCE with less variance than I_NWJ. [...] We use the convolutional encoder architecture from Burgess et al. (2018); Locatello et al. (2018) for p(y|x), and a two hidden layer fully-connected neural network to parameterize the unnormalized variational marginal q(y) used by I_JS. (Illustrative sketches of these estimators and architectures follow the table.) |
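
Since the paper releases no code, the bounds quoted above have to be reimplemented from its equations. Below is a minimal NumPy sketch, written for this summary rather than taken from the authors, of the two single-sample bounds named in the table: I_NCE (the InfoNCE bound, which cannot exceed log K for batch size K) and I_NWJ (the Nguyen–Wainwright–Jordan bound). It assumes a K×K matrix of critic scores with joint (positive) pairs on the diagonal; the function names and that convention are illustrative assumptions.

```python
import numpy as np
from scipy.special import logsumexp

def infonce_lower_bound(scores):
    """I_NCE: mean log-ratio of each positive score to the row's
    average exponentiated score. `scores[i, j]` holds the critic
    value f(x_i, y_j); the diagonal contains the K positive pairs."""
    k = scores.shape[0]
    # log[ e^{f(x_i, y_i)} / ((1/K) sum_j e^{f(x_i, y_j)}) ], averaged over i
    per_example = np.diag(scores) - logsumexp(scores, axis=1) + np.log(k)
    return per_example.mean()  # saturates at log K, hence the bias at batch 64

def nwj_lower_bound(scores):
    """I_NWJ: E_p[f] - E_q[e^{f - 1}], using the diagonal as joint
    samples and the off-diagonal entries as samples from the product
    of marginals. Looser in practice but, per the quote, high variance."""
    k = scores.shape[0]
    joint_term = np.diag(scores).mean()
    off_diag = scores[~np.eye(k, dtype=bool)]
    marginal_term = np.exp(off_diag - 1.0).mean()
    return joint_term - marginal_term
```

With K = 64 as in the quoted setup, I_NCE can never report more than log 64 ≈ 4.16 nats, which is why the response flags it as a poor estimator at that batch size; the paper's interpolated bounds trade off between these two regimes.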
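The architectural description in the last row can likewise only be approximated. Here is a hedged PyTorch sketch of the two components it names: a convolutional encoder for p(y|x) in the style of Burgess et al. (2018) and a two-hidden-layer MLP for the unnormalized variational marginal q(y) used by I_JS. All layer sizes (channel counts, the 256-unit hidden layers, the 10-dimensional latent) are assumptions borrowed from common disentanglement setups, not values reported in the paper.

```python
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    """Stand-in for the Burgess et al. (2018)-style encoder for p(y|x)
    on 64x64 single-channel dSprites images. Layer sizes are assumed,
    not taken from the paper."""
    def __init__(self, latent_dim=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Conv2d(64, 64, 4, stride=2, padding=1), nn.ReLU(),  # 8 -> 4
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 256), nn.ReLU(),
        )
        self.mean = nn.Linear(256, latent_dim)
        self.log_var = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.net(x)
        return self.mean(h), self.log_var(h)  # Gaussian p(y|x) parameters

class UnnormalizedMarginal(nn.Module):
    """Two-hidden-layer fully-connected network for the unnormalized
    variational marginal q(y) used by I_JS; the hidden width is an
    assumption."""
    def __init__(self, latent_dim=10, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar log q(y), up to a constant
        )

    def forward(self, y):
        return self.net(y).squeeze(-1)
```

A training step under this sketch would sample y from p(y|x) via the reparameterization trick and plug the resulting critic scores into one of the bound estimators above; the exact training loop is not specified in the paper.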